HISTONE DEACETYLASE-RELATED GENE AND PROTEIN

This invention relates to a histone deacetylase gene and gene product. In particular, the 
invention relates to a protein that is highly homologous to known mammalian histone deacetylases 
(HDACs), nucleic acid molecules that encode such a protein, antibodies that recognize the protein, 
and methods of use which include assays screening for modulators of HDAC activity and for
diagnosing conditions related to abnormal HDAC activity, including, for example, abnormal cell 
proliferation, cancer, atherosclerosis, inflammatory bowel disease, host inflammatory or immune 
response or psoriasis. 

Background 

Histone acetylation is a major regulatory mechanism that modulates gene expression by 
altering the accessibility of transcription factors to DNA. Acetylation of histones is a reversible 
modification of the free ε-amino group of lysine that occurs during the assembly of nucleosomes and
during DNA synthesis. 

HDACs have been shown to play an important role in the regulation of transcription. HDACs 
function as components of complexes that are involved in transcriptional repression. This is mediated 
through interactions of HDACs with multi-protein complexes and requires deacetylase activity. 
Changes in histone acetylation levels also occur during transcriptional activation and silencing. 
Acetylation of histones is generally associated with transcriptional activity, whereas deacetylation is 
associated with transcriptional repression. 

HDAC complexes may contain the co-repressor mSin3A and mSin3A-associated proteins,
silencing mediators NCoR and SMRT, transcriptional repressors, Rb-like proteins p107 and p130,
Rb-associated proteins, nuclear hormone receptors, nucleosome remodeling factors, methyl-binding
proteins, DNA repair machinery proteins, and the like. Furthermore, HDAC1 has been found to bind
directly to YY1 and Sp1, and HDACs 4 and 5 bind to MEF2. In addition, HDACs have been found
together in complexes.

Two distinct classes of yeast histone deacetylases have been identified based upon size and 
sequence. Yeast class I HDACs include Rpd3, Hos1p, and Hos2p. Class II contains yeast HDA1p.
Furthermore, members of these two classes were found to form different complexes. Human HDACs
have been classified based upon their similarity to yeast sequences. Class I human HDACs include
HDACs 1-3 and 8. Class II HDACs include HDACs 4-7. The deacetylase core of class I HDACs
resides in the first ~390 amino acids. Class II HDAC catalytic domains are located in the C-terminus of
these peptides, with the exception of HDAC6, which contains a second catalytic domain in the
N-terminus. Here we report the isolation and characterization of a new HDAC, referred to herein as
HDAC10.

An important approach that has been used to study the function of chromatin acetylation is the 
use of specific inhibitors of histone deacetylase. Several classes of compounds have been identified 
that inhibit HDAC. Histone deacetylase inhibitors have been found to have anti-proliferative effects,
including induction of G1/S and G2/M cell cycle arrest, differentiation and apoptosis of transformed
and normal cells and reversal of transformation. These effects, along with the presence of HDAC in
complexes with fusions of unliganded retinoic acid receptors PML-RARα and PLZF-RARα, indicate a
role for HDACs in tumorigenicity. Furthermore, the histone deacetylase inhibitors phenylbutyrate and
trichostatin A have shown promise in the treatment of promyelocytic leukemia, and several other
HDAC inhibitors are being studied as treatments for cancers.

Summary of the invention 

The present invention relates to a novel histone deacetylase designated HDAC10. 

In a first aspect, the invention provides an isolated polypeptide comprising an amino acid
sequence as set forth in SEQ ID NO:1. Furthermore, the invention provides an isolated polypeptide
consisting of an amino acid sequence as set forth in SEQ ID NO:1. The amino acid sequence as set
forth in SEQ ID NO:1 shows a considerable degree of homology to that of known members of the
family of HDACs in the catalytic domain. For convenience, the polypeptide consisting of the amino
acid sequence as set forth in SEQ ID NO:1 will be designated as histone deacetylase 10 or HDAC10.
Fragments of the isolated polypeptide having an amino acid sequence as set forth in SEQ ID NO:1
also form a part of the present invention. Preferably, fragments will encompass the catalytic domain,
which is predicted to exist between amino acid number 15 to 323. In accordance with this aspect of
the invention there are provided novel polypeptides of human origin as well as biologically,
diagnostically or therapeutically useful fragments, variants and derivatives thereof, variants and
derivatives of the fragments, and analogs of the foregoing.

In a second aspect, the invention provides an isolated DNA comprising a nucleotide sequence
that encodes a polypeptide as mentioned above. In particular, the invention provides (1) an isolated 
DNA comprising the nucleotide sequence as set forth in SEQ ID NO:2; (2) an isolated DNA 
comprising the nucleotide sequence set forth in SEQ ID NO:3; (3) an isolated DNA capable of 
hybridizing under high stringency conditions to the nucleotide sequence set forth in SEQ ID NO:2; 
and (4) an isolated DNA comprising the nucleotide sequence set forth in SEQ ID NO:4. Also 
provided are nucleic acid sequences comprising at least about 15 bases, preferably at least about 20 
bases, more preferably a nucleic acid sequence comprising about 30 contiguous bases of SEQ ID 
NO:2 or SEQ ID NO:3. Also within the scope of the present invention are nucleic acids that are 
substantially similar to the nucleic acid with the nucleotide sequence as set forth in SEQ ID NO:2 or 
SEQ ID NO:3. In a preferred embodiment, the isolated DNA takes the form of a vector molecule 
comprising at least a fragment of a DNA of the present invention, in particular comprising the DNA 
consisting of a nucleotide sequence as set forth in SEQ ID NO:2 or SEQ ID NO:3. 

A third aspect of the present invention encompasses a method for the diagnosis of conditions 
associated with abnormal regulation of gene expression which includes, but is not limited to, 
conditions associated with abnormal cell proliferation, cancer, atherosclerosis, inflammatory bowel 
disease, or psoriasis in a human which comprises detecting abnormal transcription of messenger RNA 
transcribed from the natural endogenous human gene encoding the novel polypeptide consisting of the 
amino acid sequence set forth in SEQ ID NO:1 in an appropriate tissue or cell from a human, wherein
such abnormal transcription is diagnostic of the human's affliction with such a condition. In
particular, the said natural endogenous human gene encoding the novel polypeptide consisting of the
amino acid sequence set forth in SEQ ID NO:1 comprises the genomic nucleotide sequence set forth
in SEQ ID NO:4. In one embodiment of the present invention, the diagnostic method comprises
contacting a sample of said appropriate tissue or cell or contacting an isolated RNA or DNA molecule
derived from that tissue or cell with an isolated nucleotide sequence of at least about 15-20
nucleotides in length that hybridizes under high stringency conditions with the isolated nucleotide
sequence encoding the novel polypeptide having an amino acid sequence set forth in SEQ ID NO:1.

Another embodiment of the assay aspect of the invention provides a method for the diagnosis 
of a condition associated with abnormal HDAC10 activity in a human, which comprises measuring 
the level of deacetylase activity in a certain tissue or cell from a human suffering from such a 
condition, wherein the presence of an abnormal level of deacetylase activity, relative to the level
thereof in the respective tissue or cell of a human not suffering from a condition associated with
abnormal HDAC activity, is diagnostic of the human's suffering from said condition.

WO 03/014340    PCT/EP02/08654

In accordance with one embodiment of this aspect of the invention there are provided antisense
polynucleotides that can regulate transcription of the gene encoding the novel HDAC10; in
another embodiment, double stranded RNA is provided that can regulate the transcription of the gene
encoding the novel HDAC10.

Another aspect of the invention provides a process for producing the aforementioned 
polypeptides, polypeptide fragments, variants and derivatives, fragments of the variants and 
derivatives, and analogs of the foregoing. In a preferred embodiment of this aspect of the invention 
there are provided methods for producing the aforementioned HDAC10 comprising culturing host
cells having incorporated therein an expression vector containing an exogenously-derived nucleotide
sequence encoding such a polypeptide under conditions sufficient for expression of the
polypeptide in the host cell, thereby causing expression of the polypeptide, and optionally recovering
the expressed polypeptide. In a preferred embodiment of this aspect of the present invention, there is
provided a method for producing polypeptides comprising or consisting of an amino acid sequence as
set forth in SEQ ID NO:1, which comprises culturing a host cell having incorporated therein an
expression vector containing an exogenously-derived polynucleotide encoding a polypeptide
comprising or consisting of an amino acid sequence as set forth in SEQ ID NO:1, under conditions
sufficient for expression of such a polypeptide in the host cell, thereby causing the production of an 
expressed polypeptide, and optionally recovering the expressed polypeptide. Preferably, in any of 
such methods the exogenously derived polynucleotide comprises or consists of the nucleotide 
sequence set forth in SEQ ID NO:2, the nucleotide sequence set forth in SEQ ID NO:3, or the 
nucleotide sequence set forth in SEQ ID NO:4. In accordance with another aspect of the invention 
there are provided products, compositions, processes and methods that utilize the aforementioned 
polypeptides and polynucleotides for, inter alia, research, biological, clinical and therapeutic 
purposes. 

In certain additional preferred embodiments of this aspect of the invention there is provided 
an antibody or a fragment thereof which specifically binds to a polypeptide that comprises the amino 
acid sequence set forth in SEQ ID NO:1, i.e., HDAC10. In certain particularly preferred
embodiments in this regard, the antibodies are highly selective for human HDAC10 polypeptides or
portions of human HDAC10 polypeptides.

In a further aspect, an antibody or fragment thereof is provided that binds to a fragment or
portion of the amino acid sequence set forth in SEQ ID NO:1.

In another aspect, methods of treating a condition in a subject, wherein the condition is 
associated with abnormal HDAC10 gene expression, an increase or decrease in the presence of 
HDAC10 polypeptide in a subject, or an increase or decrease in the activity of HDAC10 polypeptide, 
by the administration of an effective amount of an antibody that binds to a polypeptide with the amino 
acid sequence set out in SEQ ID NO: 1, or a fragment or portion thereof to the subject are provided. 
Also provided are methods for the diagnosis of a disease or condition associated with abnormal 
HDAC10 gene expression or an increase or decrease in the presence of the HDAC10 in a subject, or 
an increase or decrease in the activity of HDAC10 polypeptide. 

In yet another aspect, the invention provides host cells which can be propagated in vitro, 
preferably vertebrate cells, in particular mammalian cells, or bacterial cells, which are capable upon 
growth in culture of producing a polypeptide that comprises the amino acid sequence set forth in SEQ 
ID NO:1 or fragments thereof, where the cells contain transcriptional control DNA sequences, where
the transcriptional control sequences control transcription of RNA encoding a polypeptide with the
amino acid sequence according to SEQ ID NO:1 or fragments thereof. This includes, but is not limited
to, the propagation of HDAC10 in a plasmid and the production of DNA, RNA or protein in human or 
insect cells or bacteria using the endogenous HDAC10 promoter or any other transcriptional control 
sequence. 

In yet another aspect of the present invention there are provided assay methods and kits 
comprising the components necessary to detect above-normal expression of polynucleotides encoding 
a polypeptide comprising an amino acid sequence as set forth in SEQ ID NO: 1, or polypeptides 
comprising an amino acid sequence set forth in SEQ ID NO:1, or fragments thereof, in body tissue
samples derived from a patient, such kits comprising, e.g., antibodies that bind to a polypeptide
comprising an amino acid sequence set forth in SEQ ID NO:1, or to fragments thereof, or
oligonucleotide probes that hybridize with polynucleotides of the invention. In a preferred
embodiment, such kits also comprise instructions detailing the procedures by which the kit
components are to be used.

In another aspect, the invention is directed to use of a polypeptide comprising an amino acid
sequence set forth in SEQ ID NO:1 or fragment thereof, polynucleotide encoding such a polypeptide
or a fragment thereof, or antibody that binds to said polypeptide comprising an amino acid sequence
set forth in SEQ ID NO:1 or a fragment thereof in the manufacture of a medicament to treat diseases
associated with abnormal HDAC activity or gene expression.

Another aspect is directed to pharmaceutical compositions comprising a polypeptide 
comprising or consisting of an amino acid sequence set forth in SEQ ID NO: 1 or fragment thereof, a 
polynucleotide encoding such a polypeptide or a fragment thereof, or antibody that binds to such a 
polypeptide or a fragment thereof, in conjunction with a suitable pharmaceutical carrier, excipient or 
diluent, for the treatment of diseases associated with abnormal HDAC activity or gene expression. 

In another aspect, the invention is directed to methods for the identification of molecules that 
can bind to a polypeptide comprising an amino acid sequence set forth in SEQ ID NO:1 and/or
modulate the activity of a polypeptide comprising an amino acid sequence set forth in SEQ ID NO:1
or molecules that can bind to nucleic acid sequences that modulate the transcription or translation of a
polynucleotide encoding a polypeptide comprising an amino acid sequence set forth in SEQ ID NO:1.
Molecules identified by such methods also fall within the scope of the present invention. 

In a related aspect, the invention is directed to use of the novel HDAC10 to identify 
associated proteins in HDAC biologically relevant complexes. At present, the proteins that associate 
with HDAC10 are not known. However, these may be characterized by determining whether 
HDAC10 associates with proteins that have been previously shown to interact with other HDACs (see
Background). For example, components of HDAC10 complexes may be determined using
conventional methods, including co-immunoprecipitation.

In yet another aspect, the invention is directed to methods for the introduction of nucleic acids 
of the invention into one or more tissues of a subject in need of treatment with the result that one or 
more proteins encoded by the nucleic acids are expressed and/or secreted by cells within the tissue.

Other objects, features, advantages and aspects of the present invention will become apparent 
to those of skill from the following description. It should be understood, however, that the following 
description and the specific examples, while indicating preferred embodiments of the invention, are 
given by way of illustration only. Various changes and modifications within the spirit and scope of
the disclosed invention will become readily apparent to those skilled in the art from reading the
following description and from reading the other parts of the present disclosure. 

Brief description of the Drawings 

Figure 1 shows the amino acid sequence (SEQ ID NO:1) of HDAC10.

Figure 2 shows the full-length cDNA sequence (SEQ ID NO:2) of HDAC10. The full-length 
cDNA sequence starts at nucleotide position 1 and ends at nucleotide position 1755. 

Figure 3 shows the open reading frame of HDAC10 cDNA sequence (SEQ ID NO:3). The 
sequence starts at nucleotide position 25 and ends at nucleotide position 1065 as indicated in SEQ ID 
NO:2. 

Figure 4 shows HDAC10 genomic DNA sequence (SEQ ID NO:4). 

Detailed Description of the Invention 

In practicing the present invention, many conventional techniques in molecular biology,
microbiology, and recombinant DNA are used. These techniques are well known to one of ordinary
skill in the art. The following abbreviations are used throughout the disclosure:
histone deacetylase (HDAC), histone deacetylase-like protein (HDLP).

In its broadest sense, the term "substantially similar", when used herein with respect to a 
nucleotide sequence, means a nucleotide sequence corresponding to a reference nucleotide sequence, 
wherein the corresponding sequence encodes a polypeptide having substantially the same structure 
and function as the polypeptide encoded by the reference nucleotide sequence, e.g. where only 
changes in amino acids not affecting the polypeptide function occur. Desirably the substantially 
similar nucleotide sequence encodes the polypeptide encoded by the reference nucleotide sequence. 
The percentage of identity between the substantially similar nucleotide sequence and the reference 
nucleotide sequence desirably is at least 80%, more desirably at least 85%, preferably at least 90%, 
more preferably at least 95%, still more preferably at least 98 or 99%. Sequence comparisons are
carried out using ClustalW (see, for example, Higgins, D.G. et al. Methods Enzymol. 266:383-402
(1996)). ClustalW alignments were performed using default parameters.
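
The identity thresholds above can be illustrated with a short sketch. It assumes that the two nucleotide sequences have already been aligned (for example with ClustalW) and are supplied as equal-length strings in which "-" marks a gap. The function names percent_identity and identity_tier, and the choice to count identity only over columns in which neither sequence has a gap, are illustrative assumptions and are not definitions given in this disclosure.

```python
# Illustrative only: classifies a pre-computed pairwise alignment against the
# identity thresholds listed in the text. It does not perform the alignment.

# (threshold in percent, wording used in the text), highest threshold first
IDENTITY_TIERS = [
    (98.0, "still more preferably"),
    (95.0, "more preferably"),
    (90.0, "preferably"),
    (85.0, "more desirably"),
    (80.0, "desirably"),
]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_tier(pct: float):
    """Return the wording of the highest threshold met, or None if below 80%."""
    for threshold, label in IDENTITY_TIERS:
        if pct >= threshold:
            return label
    return None


if __name__ == "__main__":
    pct = percent_identity("ACGT-ACGTA", "ACGTTACGTT")
    print(round(pct, 1), identity_tier(pct))
```

Other conventions for percent identity exist (for example, dividing by the full alignment length), so a value computed this way may differ slightly from the figure reported by a given alignment program.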

A nucleotide sequence "substantially similar" to a reference nucleotide sequence hybridizes to
the reference nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA
at 50°C with washing in 2X SSC, 0.1% SDS at 50°C, more desirably in 7% sodium dodecyl sulfate
(SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 1X SSC, 0.1% SDS at 50°C, more
desirably still in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing
in 0.5X SSC, 0.1% SDS at 50°C, preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1
mM EDTA at 50°C with washing in 0.1X SSC, 0.1% SDS at 50°C, more preferably in 7% sodium
dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.1X SSC, 0.1% SDS at
65°C, yet still encodes a functionally equivalent gene product.
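
The five wash conditions in the preceding paragraph form a ladder of increasing stringency, which may be easier to follow as data. The sketch below reads the wash concentrations as 2X, 1X, 0.5X and 0.1X SSC; the names WashCondition, WASH_LADDER and is_more_stringent are hypothetical, and the comparison rule (lower salt or higher temperature is more stringent, with the other variable no less stringent) is the usual rule of thumb rather than a definition from this disclosure.

```python
from typing import NamedTuple


class WashCondition(NamedTuple):
    ssc_x: float        # SSC concentration, in multiples of 1X
    sds_percent: float  # SDS, percent
    temp_c: int         # wash temperature, degrees Celsius


# Hybridization buffer shared by all five conditions in the text.
HYBRIDIZATION = "7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50 C"

# Wash steps in the order given in the text, least stringent first.
WASH_LADDER = [
    WashCondition(2.0, 0.1, 50),
    WashCondition(1.0, 0.1, 50),
    WashCondition(0.5, 0.1, 50),
    WashCondition(0.1, 0.1, 50),
    WashCondition(0.1, 0.1, 65),
]


def is_more_stringent(a: WashCondition, b: WashCondition) -> bool:
    """True if a is no less stringent than b in salt and temperature and
    strictly more stringent in at least one of them."""
    no_less = a.ssc_x <= b.ssc_x and a.temp_c >= b.temp_c
    strictly = a.ssc_x < b.ssc_x or a.temp_c > b.temp_c
    return no_less and strictly


if __name__ == "__main__":
    for lower, higher in zip(WASH_LADDER, WASH_LADDER[1:]):
        print(higher, "more stringent than", lower, is_more_stringent(higher, lower))
```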

"Elevated transcription of mRNA" refers to a greater amount of messenger RNA transcribed 
from the natural endogenous human gene encoding the novel polypeptide of the present invention 
present in an appropriate tissue or cell of an individual suffering from a condition associated with 
abnormal HDAC10 activity than in a subject not suffering from such a disease or condition; in 
particular at least about twice, preferably at least about five times, more preferably at least about ten 
times, most preferably at least about 100 times the amount of mRNA found in corresponding tissues 
in humans who do not suffer from such a condition. Such elevated level of mRNA may eventually 
lead to increased levels of protein translated from such mRNA in an individual suffering from a 
condition associated with abnormal cellular proliferation as compared with a healthy individual. It is 
also understood that "elevated transcription of mRNA" may refer to a greater amount of messenger 
RNA transcribed from genes the expression of which is modulated by HDAC10 either alone or in 
combination with other molecules. 
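
As an illustration of the fold thresholds in this definition, the following sketch compares an mRNA measurement from a test sample with one from a reference (unaffected) sample. The function name elevation_tier, the use of a simple ratio of two measurements in the same arbitrary units, and the treatment of "about" as an exact cut-off are assumptions made for the example only.

```python
# Illustrative only: maps a fold increase in mRNA level onto the thresholds
# named in the definition of "elevated transcription of mRNA".

# (minimum fold increase, wording used in the text), highest first
FOLD_TIERS = [
    (100.0, "most preferably"),
    (10.0, "more preferably"),
    (5.0, "preferably"),
    (2.0, "in particular"),
]


def elevation_tier(sample_mrna: float, reference_mrna: float):
    """Return (fold, label); label is None when the fold increase is below 2."""
    if reference_mrna <= 0:
        raise ValueError("reference mRNA level must be positive")
    if sample_mrna < 0:
        raise ValueError("sample mRNA level must not be negative")
    fold = sample_mrna / reference_mrna
    for threshold, label in FOLD_TIERS:
        if fold >= threshold:
            return fold, label
    return fold, None


if __name__ == "__main__":
    print(elevation_tier(12.0, 1.0))
    print(elevation_tier(3.0, 2.0))
```

In practice the two measurements would need to be normalized in the same way (for example to a housekeeping transcript) before the ratio is meaningful; that step is outside this sketch.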

A "host cell," as used herein, refers to a prokaryotic or eukaryotic cell that contains 
heterologous DNA that has been introduced into the cell by any means, e.g., electroporation, calcium 
phosphate precipitation, microinjection, transformation, viral infection, and the like. 

"Heterologous" as used herein means "of different natural origin" or representing a non-natural
state. For example, if a host cell is transformed with a DNA or gene derived from another organism,
particularly from another species, that gene is heterologous with respect to that host cell and also with
respect to descendants of the host cell which carry that gene. Similarly, heterologous refers to a
nucleotide sequence derived from and inserted into the same natural, original cell type, but which is
present in a non-natural state, e.g. a different copy number, or under the control of different regulatory 
elements. 

A "vector" molecule is a nucleic acid molecule into which heterologous nucleic acid may be 
inserted which can then be introduced into an appropriate host cell. Vectors preferably have one or
more origins of replication, and one or more sites into which the recombinant DNA can be inserted.
Vectors often have convenient means by which cells with vectors can be selected from those without, 
e.g., they encode drug resistance genes. Common vectors include plasmids, viral genomes, and 
(primarily in yeast and bacteria) "artificial chromosomes." 

"Plasmids" generally are designated herein by a lower case p preceded and/or followed by 
capital letters and/or numbers, in accordance with standard naming conventions that are familiar to 
those of skill in the art. Starting plasmids disclosed herein are either commercially available, publicly 
available on an unrestricted basis, or can be constructed from available plasmids by routine 
application of well-known, published procedures. Many plasmids and other cloning and expression 
vectors that can be used in accordance with the present invention are well known and readily available 
to those of skill in the art. Moreover, those of skill readily may construct any number of other 
plasmids suitable for use in the invention. The properties, construction and use of such plasmids, as 
well as other vectors, in the present invention will be readily apparent to those of skill from the 
present disclosure. 

The term "isolated" means that the material is removed from its original environment (e.g., 
the natural environment if it is naturally occurring). For example, a naturally occurring 
polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide 
or polypeptide, separated from some or all of the coexisting materials in the natural system, is 
isolated, even if subsequently reintroduced into the natural system. Such polynucleotides could be 
part of a vector and/or such polynucleotides or polypeptides could be part of a composition, and still 
be isolated in that such vector or composition is not part of its natural environment. 

As used herein, the term "transcriptional control sequence" refers to DNA sequences, such as 
initiator sequences, enhancer sequences, and promoter sequences, which induce, repress, or otherwise 
control the transcription of protein encoding nucleic acid sequences to which they are operably linked.

As used herein, "human transcriptional control sequences" are any of those transcriptional
control sequences normally found associated with the human gene encoding the novel HDAC10 
polypeptide of the present invention as it is found in the respective human chromosome. It is 
understood that the term may also refer to transcriptional control sequences normally found associated 
with human genes the expression of which is modulated by HDAC10 either alone or in combination 
with other molecules. 

As used herein, "non-human transcriptional control sequence" is any transcriptional control 
sequence not found in the human genome. 

The term "polypeptide" is used interchangeably herein with the terms "polypeptides" and 
"protein(s)". 

As used herein, a "chemical derivative" of a polypeptide of the invention is a polypeptide of 
the invention that contains additional chemical moieties not normally a part of the molecule. Such 
moieties may improve the molecule's solubility, absorption, biological half life, etc. The moieties may 
alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable side effect 
of the molecule, etc. Moieties capable of mediating such effects are disclosed, for example, in 
Remington's Pharmaceutical Sciences, 16th ed., Mack Publishing Co., Easton, Pa. (1980). 

As used herein, "HDAC10" refers to the amino acid sequences of substantially purified
HDAC10 obtained from any species, particularly mammalian, including bovine, ovine, porcine, 
murine, equine, and preferably human, from any source, whether natural, synthetic, semi-synthetic, or 
recombinant. 

As used herein, "HDAC activity", including "HDAC10 activity", refers to the ability of an
HDAC polypeptide to deacetylate histone proteins, including 3H-labeled H4 histone peptide. Such
activity may be measured according to conventional methods. A biologically "active" protein refers to 
a protein having structural, regulatory, or biochemical functions of a naturally occurring molecule. 

The term "agonist", as used herein, refers to a molecule which when bound to HDAC10, causes 
a change in HDAC10 which modulates the activity of HDAC10. Agonists may include proteins, 
nucleic acids, carbohydrates, or any other molecules that bind to HDAC10.

The terms "antagonist" or "inhibitor" as used herein, refer to a molecule which when bound to
HDAC10, blocks or modulates the biological activity of HDAC10. Antagonists and inhibitors may 
include proteins, nucleic acids, carbohydrates, or any other molecules, natural or synthetic that bind to 
HDAC10. 

The full-length cDNA for HDAC10 is 1755 base pairs in length and it predicts a protein of 347 
amino acids. The predicted HDAC10 protein possesses a putative catalytic domain which 
encompasses approximately 317 amino acids (~6 to 323) based upon alignments of HDAC10 with the
putative catalytic domains of all of the other known HDACs. To identify the catalytic domain of
HDAC10, ClustalW alignments were performed separately using HDAC10 complete peptide and
catalytic domain sequences from class I HDACs (1-3 and 8) or class II HDACs (4-7).
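
The sequence figures quoted in this disclosure can be cross-checked with simple arithmetic. The sketch below uses the open reading frame coordinates given for SEQ ID NO:3 (nucleotide positions 25 to 1065 of SEQ ID NO:2), the stated cDNA length of 1755 base pairs and the predicted protein length of 347 amino acids. It assumes the coordinates are 1-based and inclusive; whether position 1065 includes a stop codon is not stated in the text, so the sketch counts codons only. The function names are hypothetical.

```python
# Coordinates quoted in the text; treated as 1-based and inclusive (an
# assumption about the numbering convention).
CDNA_LENGTH = 1755
ORF_START = 25
ORF_END = 1065
PREDICTED_PROTEIN_LENGTH = 347


def orf_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based, inclusive interval."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in the interval; rejects partial codons."""
    length = orf_length(start, end)
    if length % 3 != 0:
        raise ValueError("interval length is not a multiple of three")
    return length // 3


if __name__ == "__main__":
    print(orf_length(ORF_START, ORF_END))    # 1041
    print(codon_count(ORF_START, ORF_END))   # 347
    print(ORF_END <= CDNA_LENGTH)            # True
```

The interval spans 1041 nucleotides, that is 347 codons, which agrees with the predicted protein length of 347 amino acids if the quoted interval does not include a stop codon.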

Table 2 below shows the catalytic domain amino acids of HDACs 1-10 that align with histone 
deacetylase-like protein (HDLP), a bacterial protein that shares 35.2% homology with HDAC1 and 
possesses deacetylase activity (Finnin, M. S., Donigian, J. R., Cohen, A., Richon, V. M., Rifkind, R.
A., Marks, P. A., Breslow, R., and Pavletich, N. P. (1999) Nature 401, 188-193). 



Table 2. HDAC catalytic amino acids

[Table: amino acids in the catalytic domains of HDAC isoforms, with one row per isoform (including HDLP, HDAC1, HDAC4, HDAC6-1, HDAC9 and HDAC10) giving the residue and residue position at each aligned catalytic site; the individual entries are illegible in the source.]

Italicized amino acids represent amino acids that are not always conserved.

As a member of the HDAC family, HDAC10 may form biologically relevant complexes with
proteins and display functions that have been described for other HDACs. For example, it is likely to
be involved in transcription repression as a component of multi-protein complexes that often include
transcription co-repressors. Thus, increased activity or expression of HDAC10 may be associated with
numerous pathological conditions, including but not limited to, abnormal cell proliferation, cancer,
atherosclerosis, inflammatory bowel disease, host inflammatory or immune response, or psoriasis.

Thus, the identification of HDAC10 is useful for designing agents (e.g. antagonists or 
inhibitors) useful to ameliorate conditions associated with abnormal HDAC activity. These may 
include, for example, antiproliferative or anti-inflammatory agents either through the use of small
molecules or proteins (e.g. antibodies) directed against it or its associated proteins in the HDAC
transcription repressor complexes. In addition, protein derived from the HDAC10 sequence may also
be used as a therapeutic to modify host cell proliferative or inflammatory responses. 

To determine the tissue distribution of HDAC10 in humans, Northern analyses were performed
using a blot containing mRNA isolated from various human tissues. The results indicate that the overall
expression level of HDAC10 is low and the highest expression level is restricted to brain, heart,
skeletal muscle and kidney. Furthermore, real-time PCR experiments reveal that HDAC10 is also
highly expressed in testis as well as several human cancerous cell lines. Thus, HDAC10 represents a
transcribed gene.

In one aspect, the present invention relates to a novel histone deacetylase (HDAC). As 
outlined above, HDAC10 is clearly a member of the HDAC family since it is highly similar to other
HDAC proteins, especially in the catalytic domain. 

The present invention relates to an isolated polypeptide comprising the amino acid sequence 
set forth in SEQ ID NO:1. For example, such a polypeptide may be a fusion protein including the
amino acid sequence of the novel HDAC10. In another aspect the present invention relates to an
isolated polypeptide consisting of the amino acid sequence set forth in SEQ ID NO:1, which is, in
particular, the novel HDAC10.

The invention includes nucleic acid or nucleotide molecules, preferably DNA molecules, in 
particular encoding the novel HDAC10. Preferably, an isolated nucleic acid molecule, preferably a 
DNA molecule, of the present invention encodes a polypeptide comprising the amino acid sequence 
set forth in SEQ ID NO:1. Likewise preferred is an isolated nucleic acid molecule, preferably a DNA 
molecule, encoding a polypeptide consisting of the amino acid sequence set forth in SEQ ID NO:1. 
Such a nucleic acid or nucleotide, in particular such a DNA molecule, preferably comprises a 
nucleotide sequence selected from the group consisting of (1) the nucleotide sequence as set forth in 
SEQ ID NO:2, which is the full-length cDNA sequence encoding the polypeptide consisting of the 
amino acid sequence set forth in SEQ ID NO:1; (2) the nucleotide sequence set forth in SEQ ID 
NO:3, which corresponds to the open reading frame of the cDNA sequence set forth in SEQ ID NO:2; 
(3) a nucleotide sequence capable of hybridizing under high stringency conditions to a nucleotide 
sequence set forth in SEQ ID NO:3; and (4) the nucleotide sequence set forth in SEQ ID NO:4, which 
corresponds to the endogenous genomic human DNA encoding the polypeptide consisting of the 
amino acid sequence set forth in SEQ ID NO:1. Such hybridization conditions may be highly stringent 
or less highly stringent, as described above. In instances wherein the nucleic acid molecules are 
deoxyoligonucleotides ("oligos"), highly stringent conditions may refer, e.g., to washing in 6X 
SSC/0.05% sodium pyrophosphate at 37 °C (for 14-base oligos), 48 °C (for 17-base oligos), 55 °C 
(for 20-base oligos), and 60 °C (for 23-base oligos). Suitable ranges of such stringency conditions for 
nucleic acids of varying compositions are described in Krause and Aaronson (1991), Methods in 
Enzymology, 200:546-556. 
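
The wash temperatures listed above for oligonucleotide probes can be held in a small lookup table. The following Python sketch is illustrative only: the names HIGH_STRINGENCY_WASH_C and wash_temperature are hypothetical, and the refusal to interpolate between the listed lengths is an assumption of the sketch, since the text gives no rule for intermediate lengths.

```python
# Wash temperatures (degrees C) for oligonucleotide probes washed in
# 6X SSC / 0.05% sodium pyrophosphate, as listed in the text.
# Keys are oligo lengths in bases.
HIGH_STRINGENCY_WASH_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature(oligo_length: int) -> int:
    """Return the listed wash temperature for an oligo of the given length.

    Only the four lengths named in the text are supported. Other lengths
    raise ValueError instead of being interpolated (an assumption of this
    sketch, not a rule stated in the text).
    """
    try:
        return HIGH_STRINGENCY_WASH_C[oligo_length]
    except KeyError:
        raise ValueError(
            f"no listed wash temperature for {oligo_length}-base oligos"
        ) from None


if __name__ == "__main__":
    for n in sorted(HIGH_STRINGENCY_WASH_C):
        print(f"{n}-base oligo: wash at {wash_temperature(n)} C")
```

For oligos of other lengths or compositions, the cited Methods in Enzymology reference remains the source for suitable ranges.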

These nucleic acid molecules may act as target gene antisense molecules, useful, for example, 
in target gene regulation and/or as antisense primers in amplification reactions of target gene nucleic 
acid sequences. Further, such sequences may be used as part of ribozyme and/or triple helix 
sequences, also useful for target gene regulation. Still further, such molecules may be used as 
components of diagnostic methods whereby the presence of an allele causing a disease associated 
with abnormal HDAC10 expression or activity, for example, abnormal cell proliferation, cancer, 
atherosclerosis, inflammatory bowel disease, host inflammatory or immune response, or psoriasis, 
may be detected. 

The invention also encompasses (a) vectors that contain at least a fragment of any of the 
foregoing nucleotide sequences and/or their complements (i.e., antisense); (b) vector molecules, 
preferably vector molecules comprising transcriptional control sequences, in particular expression 
vectors, that contain any of the foregoing coding sequences operatively associated with a regulatory 
element that directs the expression of the coding sequences; and (c) genetically engineered host cells 
that contain a vector molecule as mentioned herein or at least a fragment of any of the foregoing 
nucleotide sequences operatively associated with a regulatory element that directs the expression of 
the coding sequences in the host cell. As used herein, regulatory elements include, but are not limited 
to, inducible and non-inducible promoters, enhancers, operators and other elements known to those 
skilled in the art that drive and regulate expression. Preferably, host cells can be vertebrate host cells, 
preferably mammalian host cells, such as human cells or rodent cells, such as CHO or BHK cells. 
Likewise preferred, host cells can be bacterial host cells, in particular E.coli cells. 

Particularly preferred is a host cell, in particular of the above described type, which can be 
propagated in vitro and which is capable upon growth in culture of producing an HDAC10 
polypeptide, in particular a polypeptide comprising or consisting of an amino acid sequence set forth 
in SEQ ID NO:1, wherein said cell contains some fragment or complete sequence of HDAC10 coding 
sequence in a construct that is controlled by one or more transcriptional control sequences that are not 
transcriptional control sequences of the natural endogenous human gene encoding said polypeptide, 
wherein said one or more transcriptional control sequences control transcription of a DNA encoding 
said polypeptide. Possible transcriptional control sequences include, but are not limited to, bacterial 
or viral promoter sequences. 

The invention includes the complete sequence of the gene as well as fragments of any of the 
nucleic acid sequences disclosed herein. Fragments of the nucleic acid sequences encoding the novel 
HDAC10 polypeptide may be used as a hybridization probe for a cDNA library to isolate other genes 
which have a high sequence similarity to the HDAC10 gene or similar biological activity. Probes of 
this type preferably have at least about 30 bases and may contain, for example, from about 30 to about 
50 bases, about 50 to about 100 bases, about 100 to about 200 bases, or more than 200 bases. The 
probe may also be used to identify a cDNA clone that corresponds to a full-length transcript and a 
genomic clone or clones that contain the complete HDAC10 gene including regulatory and promoter 
regions, exons, and introns. An example of a screen comprises isolating the coding region of the 
HDAC10 gene by using the known DNA sequence to synthesize an oligonucleotide probe. Labeled 
oligonucleotides having a sequence complementary to that of the gene of the present invention may be 
used to screen a library of human cDNA, genomic DNA or mRNA to determine the members of 
the library to which the probe hybridizes. 
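
As a rough illustration of how complementary probes of the preferred length might be derived from a known sequence, the following Python sketch tiles a target with fixed-length windows and returns the reverse complement of each. The function names, the 10-base step, the strict 30-base minimum and the demonstration sequence are all assumptions of the sketch; none is taken from SEQ ID NOs:2 to 4.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# The text prefers probes of at least about 30 bases; the sketch treats
# 30 as a hard minimum for simplicity.
MIN_PROBE_LENGTH = 30


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def candidate_probes(target: str, length: int = MIN_PROBE_LENGTH, step: int = 10):
    """Yield (start, probe) pairs.

    Each probe is complementary to the window of `length` bases that
    starts at 0-based position `start` in `target`.
    """
    if length < MIN_PROBE_LENGTH:
        raise ValueError("probe shorter than the preferred minimum of about 30 bases")
    for start in range(0, len(target) - length + 1, step):
        yield start, reverse_complement(target[start:start + length])


if __name__ == "__main__":
    demo_target = "ATGGCC" * 10  # hypothetical 60-base sequence
    for start, probe in candidate_probes(demo_target):
        print(start, probe)
```

In practice, candidate probes would also be filtered for melting temperature and uniqueness, which the sketch does not attempt.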

In addition to the gene sequences described above, homologs of such sequences, as may, for 
example, be present in other species, may be identified and may be readily isolated, without undue 
experimentation, by molecular biological techniques well known in the art. Furthermore, there may 
exist genes at other genetic loci within the genome that encode proteins that have homology to one or 
more domains of such gene products. These genes may also be identified via similar techniques. For 
example, the isolated nucleotide sequence of the present invention encoding the novel HDAC10 
polypeptide may be labeled and used to screen a cDNA library constructed from mRNA obtained 
from the organism of interest. Hybridization conditions will be of a lower stringency when the cDNA 
library is derived from an organism different from the type of organism from which the labeled 
sequence was derived. Alternatively, the labeled fragment may be used to screen a genomic library 
derived from the organism of interest, again, using appropriately stringent conditions. Such low 
stringency conditions will be well known to those of skill in the art, and will vary predictably 
depending on the specific organisms from which the library and the labeled sequences are derived. 

Further, a previously unknown differentially expressed gene-type sequence may be isolated 
by performing PCR using two degenerate oligonucleotide primer pools designed on the basis of amino 
acid sequences within the gene of interest. The template for the reaction may be cDNA obtained by 
reverse transcription of mRNA prepared from human or non-human cell lines or tissue known or 
suspected to express a differentially expressed gene allele. The PCR product may be subcloned and 
sequenced to ensure that the amplified sequences represent the sequences of a differentially expressed 
gene-like nucleic acid sequence. The PCR fragment may then be used to isolate a complete cDNA 
clone by a variety of conventional methods. For example, the amplified fragment may be labeled and 
used to screen a bacteriophage cDNA library. Alternatively, the labeled fragment may be used to 
screen a genomic library. 
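
The design of a degenerate primer pool from an amino acid sequence can be sketched by back-translating a short peptide through the standard genetic code. The Python example below is a minimal illustration: the peptide "MKWD" is hypothetical and is not taken from SEQ ID NO:1, and the function names pool_size and primer_pool are likewise illustrative.

```python
from itertools import product
from math import prod

# Standard genetic code, written in the conventional T, C, A, G order for
# each codon position; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (one-letter code) to the list of codons encoding it.
CODONS = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append(b1 + b2 + b3)


def pool_size(peptide: str) -> int:
    """Number of distinct oligonucleotides needed to encode the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)


def primer_pool(peptide: str) -> list:
    """Enumerate every sense-strand oligonucleotide encoding the peptide."""
    return ["".join(codons) for codons in product(*(CODONS[aa] for aa in peptide))]


if __name__ == "__main__":
    peptide = "MKWD"  # hypothetical stretch with low codon degeneracy
    print(pool_size(peptide), primer_pool(peptide))
```

Stretches rich in methionine and tryptophan (one codon each) keep the pool small, whereas leucine, serine and arginine (six codons each) enlarge it quickly.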

PCR technology may also be utilized to isolate full-length cDNA sequences. For example, 
RNA may be isolated, following standard procedures, from an appropriate cellular or tissue source. A 
reverse transcription reaction may be performed on the RNA using an oligonucleotide primer specific 
for the most 5' end of the amplified fragment for the priming of first strand synthesis. The resulting 
RNA/DNA hybrid may then be "tailed" with guanines using a standard terminal transferase reaction, 
the hybrid may be digested with RNase H, and second strand synthesis may then be primed with a 
poly-C primer. Thus, cDNA sequences upstream of the amplified fragment may easily be isolated. 

In cases where the gene identified is the normal, or the wild type gene, this gene may be used 
to isolate mutant alleles of the gene. Isolation of mutant alleles is preferable in processes and 
disorders that are known or suspected to have a genetic basis. Mutant alleles may be isolated from 
individuals either known or suspected to have a genotype which contributes to disease symptoms 
related to abnormal HDAC activity, including, but not limited to, conditions such as abnormal cell 
proliferation, cancer, atherosclerosis, inflammatory bowel disease, host inflammatory or immune 
response, or psoriasis. Mutant alleles and mutant allele products may then be used in the diagnostic 
assay systems described below. 

A cDNA of the mutant gene may be isolated, for example, using PCR, a technique that is well 
known to those of skill in the art. In this case, the first cDNA strand may be synthesized by 
hybridizing an oligo-dT oligonucleotide to mRNA isolated from tissue known or suspected to 
express the gene in an individual putatively carrying the mutant allele, and by extending the new strand with 
reverse transcriptase. The second strand of the cDNA is then synthesized using an oligonucleotide 
that hybridizes specifically to the 5' end of the normal gene. Using these two primers, the product is 
then amplified via PCR, cloned into a suitable vector, and subjected to DNA sequence analysis 
through methods well known to those of skill in the art. By comparing the DNA sequence of the 
mutant gene to that of the normal gene, the mutation(s) responsible for the loss or alteration of 
function of the mutant gene product can be ascertained. 
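
The comparison of mutant and normal sequences can be illustrated with a simple position-by-position check. The Python sketch below assumes the two sequences are already aligned and of equal length, so it reports substitutions only; insertions and deletions would require a proper alignment step first. The function name point_differences and the example sequences are hypothetical.

```python
def point_differences(normal: str, mutant: str) -> list:
    """Return (position, normal_base, mutant_base) for each mismatch.

    Positions are 1-based. The sequences must have equal length: this
    sketch handles substitutions only, not insertions or deletions.
    """
    if len(normal) != len(mutant):
        raise ValueError("sequences differ in length; align them first")
    return [
        (i + 1, n, m)
        for i, (n, m) in enumerate(zip(normal.upper(), mutant.upper()))
        if n != m
    ]


if __name__ == "__main__":
    # Hypothetical sequences differing by a single G-to-A substitution.
    print(point_differences("ATGGCC", "ATGACC"))
```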

Alternatively, a genomic or cDNA library can be constructed and screened using DNA or 
RNA, respectively, from a tissue known to or suspected of expressing the gene of interest in an 
individual suspected of or known to carry the mutant allele. The normal gene or any suitable fragment 
thereof may then be labeled and used as a probe to identify the corresponding mutant allele in the 
library. The clone containing this gene may then be purified through methods routinely practiced in 
the art, and subjected to sequence analysis as described above. 

Additionally, an expression library can be constructed utilizing DNA isolated from or cDNA 
synthesized from a tissue known to or suspected of expressing the gene of interest in an individual 
suspected of or known to carry the mutant allele. In this manner, gene products made by the putatively 
mutant tissue may be expressed and screened using standard antibody screening techniques in 
conjunction with antibodies raised against the normal gene product, as described below. In cases 
where the mutation results in an expressed gene product with altered function (e.g., as a result of a 
missense mutation), a polyclonal set of antibodies is likely to cross-react with the mutant gene 
product. Library clones detected via their reaction with such labeled antibodies can be purified and 
subjected to sequence analysis as described above. 

The present invention includes those proteins encoded by nucleotide sequences set forth in 
any of SEQ ID NOs:2, 3 or 4, in particular, a polypeptide that is or includes the amino acid sequence 
set out in SEQ ID NO:1, or fragments thereof. 

Furthermore, the present invention includes proteins that represent functionally equivalent 
gene products. Such an equivalent differentially expressed gene product may contain deletions, 
additions or substitutions of amino acid residues within the amino acid sequence encoded by the 
differentially expressed gene sequences described above, but which result in a silent change, thus 
producing a functionally equivalent differentially expressed gene product. Amino acid substitutions 
may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, 
and/or the amphipathic nature of the residues involved. 

Nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, 
phenylalanine, tryptophan, and methionine. Polar neutral amino acids include glycine, serine, 
threonine, cysteine, tyrosine, asparagine, and glutamine. Positively charged (basic) amino acids 
include arginine, lysine, and histidine; and negatively charged (acidic) amino acids include aspartic 
acid and glutamic acid. "Functionally equivalent," as utilized herein, may refer to a protein or 
polypeptide capable of exhibiting a substantially similar in vivo or in vitro activity as the endogenous 
differentially expressed gene products encoded by the differentially expressed gene sequences 
described above. "Functionally equivalent" may also refer to proteins or polypeptides capable of 
interacting with other cellular or extracellular molecules in a manner substantially similar to the way 
in which the corresponding portion of the endogenous differentially expressed gene product would. 
For example, a "functionally equivalent" peptide, the sequence of which was modified from the 
endogenous peptide to achieve "functional equivalency," would be able, in an immunoassay, to 
diminish the binding of an antibody to the corresponding peptide within the endogenous protein, or 
the binding to the endogenous protein itself, against which the antibody was raised. An equimolar 
concentration of the functionally equivalent peptide will diminish the aforesaid binding of the 
corresponding peptide by at least about 5%, preferably between about 5% and 10%, more preferably 
between about 10% and 25%, even more preferably between about 25% and 50%, and most 
preferably between about 40% and 50%. 
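
The four residue classes named above lend themselves to a simple lookup. The Python sketch below encodes them and tests whether a substitution stays within one class. Treating a same-class substitution as a candidate for a silent change is an assumption of the sketch, since the text lists several properties (polarity, charge, solubility and so on) without fixing a rule; the names GROUPS, CLASS_OF and is_conservative are illustrative.

```python
# Residue classes exactly as listed in the text (one-letter codes).
GROUPS = {
    "nonpolar": "ALIVPFWM",      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar neutral": "GSTCYNQ",  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": "RKH",              # Arg, Lys, His
    "acidic": "DE",              # Asp, Glu
}

# Reverse lookup: residue -> class name.
CLASS_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same class listed in the text."""
    return CLASS_OF[original] == CLASS_OF[replacement]


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # both nonpolar
    print(is_conservative("D", "K"))  # acidic versus basic
```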

The polypeptides of the present invention may be produced by recombinant DNA technology 
using techniques well known in the art. Therefore, there is provided a method of producing a 
polypeptide of the present invention, which method comprises culturing a host cell having 



WO 03/014340 PCT/EP02/08654 

incorporated therein an expression vector containing an exogenously-derived polynucleotide encoding 
a polypeptide comprising an amino acid sequence as set forth in SEQ ID NO: 1 under conditions 
sufficient for expression of the polypeptide in the host cell, thereby causing the production of the 
expressed polypeptide. Optionally, said method further comprises recovering the polypeptide 
produced by said cell. In a preferred embodiment of such a method, said exogenously-derived 
polynucleotide encodes a polypeptide consisting of an amino acid sequence set forth in SEQ ID NO:1. 
Preferably, said exogenously-derived polynucleotide comprises the nucleotide sequence as set forth in 
any of SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4. In case of using the nucleotide sequence set 
forth in SEQ ID NO:3, i.e. the open reading frame, the sequence, when inserted into a vector, may be 
followed by one or more appropriate translation stop codons, preferably by the natural endogenous 
stop codon TGA beginning at nucleotide 1066 in the cDNA sequence. 
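
The step of following an open reading frame with a translation stop codon before insertion into a vector can be sketched as follows. This Python example is illustrative only: the function name with_stop_codon and the short test sequences are hypothetical, and TGA is used as the default because the text names it as the natural endogenous stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def with_stop_codon(orf: str, stop: str = "TGA") -> str:
    """Return the ORF followed by a stop codon.

    The ORF must be a whole number of codons. If it already ends in a
    stop codon it is returned unchanged (apart from upper-casing).
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    if stop not in STOP_CODONS:
        raise ValueError(f"{stop!r} is not a stop codon")
    if orf[-3:] in STOP_CODONS:
        return orf
    return orf + stop


if __name__ == "__main__":
    print(with_stop_codon("ATGGCC"))     # hypothetical two-codon ORF
    print(with_stop_codon("ATGGCCTAA"))  # already terminated
```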

Thus, methods for preparing the polypeptides and peptides of the invention by expressing 
nucleic acid encoding respective nucleotide sequences are described herein. Methods which are well- 
known to those skilled in the art can be used to construct expression vectors that contain protein 
coding sequences and appropriate transcriptional/translational control signals. These methods include, 
for example, in vitro recombinant DNA techniques, synthetic techniques and in vivo 
recombination/genetic recombination. Alternatively, RNA capable of encoding differentially 
expressed gene protein sequences may be chemically synthesized using, for example, synthesizers. 

A variety of host-expression vector systems may be utilized to express the HDAC10 gene 
coding sequences of the invention. Such host-expression systems represent vehicles by which the 
coding sequences of interest may be produced and subsequently purified, but also represent cells 
which may, when transformed or transfected with the appropriate nucleotide coding sequences, 
exhibit the HDAC10 gene protein of the invention in situ. These include, but are not limited to, 
microorganisms such as bacteria (e.g., E. coli, B. subtilis) transformed with recombinant 
bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors containing differentially 
expressed gene protein coding sequences; yeast (e.g. Saccharomyces, Pichia) transformed with 
recombinant yeast expression vectors containing the differentially expressed gene protein coding 
sequences; insect cell systems infected or transfected with recombinant virus expression vectors (e.g., 
baculovirus) containing the differentially expressed gene protein coding sequences; plant cell systems 
infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco 
mosaic virus, TMV) or transformed with recombinant vectors, including plasmids, (e.g., Ti plasmid) 
containing protein coding sequences; or mammalian cell systems (e.g. COS, CHO, BHK, 293, 3T3) 
harboring recombinant expression constructs containing promoters derived from the genome of 
mammalian cells (e.g., metallothioneine promoter) or from mammalian viruses (e.g., the adenovirus 
late promoter; the vaccinia virus 7.5K promoter, or the CMV promoter). 

Expression of the HDAC10 of the present invention by a cell from an HDAC10 encoding 
gene that is native to the cell can also be performed. Such methods are known in the art. Cells that 
have been induced to express HDAC10 can be implanted into a desired tissue in a living animal in 
order to increase the local concentration of HDAC10 in the tissue. 

In bacterial systems, a number of expression vectors may be advantageously selected 
depending upon the use intended for the protein being expressed. For example, when a large quantity 
of such a protein is to be produced, for the generation of antibodies or to screen peptide libraries, for 
example, vectors which direct the expression of high levels of fusion protein products that are readily 
purified may be desirable. In this respect, fusion proteins comprising hexahistidine tags may be used, 
such as EpiTag vectors including pCDNA3.1/His (Invitrogen, Carlsbad, CA). Other vectors include, 
but are not limited to, the E. coli expression vector pUR278 (Ruther et al., 1983, EMBO J. 2:1791), in 
which the protein coding sequence may be ligated individually into the vector in frame with the lac Z 
coding region so that a fusion protein is produced; pIN vectors; and the like. pGEX vectors may also 
be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In 
general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption to 
glutathione-agarose beads followed by elution in the presence of free glutathione. The pGEX vectors 
are designed to include thrombin or factor Xa protease cleavage sites so that the cloned target gene 
protein can be released from the GST moiety. Fusion proteins containing Flag tags, such as 3X Flag 
(Sigma, St. Louis, MO) or myc tags, for example pCDNA3.1/myc-His (Invitrogen, Carlsbad, CA) may 
be used. These fusions allow coimmunoprecipitation and Western detection of proteins for which 
antibodies are not yet available. 

Promoter regions from any desired gene can be introduced into vectors containing a reporter 
transcription unit, such as a chloramphenicol acetyl transferase ("CAT"), or the luciferase 
transcription unit, which also lack a promoter region. Restriction site or sites in the vector can be used 
for introducing a candidate promoter fragment; i.e., a fragment that may contain a promoter. For 
example, introduction into the vector of a promoter-containing fragment at the restriction site 
upstream of the cat gene engenders production of CAT activity, which can be detected by standard 
CAT assays. Vectors suitable to this end are well known and readily available. Two such vectors are 
pKK232-8 and pCM7. Thus, promoters for expression of polynucleotides of the present invention 
include not only well-known and readily available promoters, but also promoters that readily may be 
obtained by the foregoing technique, using a reporter gene. 

Among known bacterial promoters suitable for expression of polynucleotides and 
polypeptides in accordance with the present invention are the E. coli lacI and lacZ promoters, the T3 
and T7 promoters, the T5 tac promoter, the lambda PR, PL promoters and the trp promoter. Among 
known eukaryotic promoters suitable in this regard are the CMV immediate early promoter, the HSV 
thymidine kinase promoter, the early and late SV40 promoters, the promoters of retroviral LTRs, such 
as those of the Rous sarcoma virus ("RSV"), and metallothionein promoters, such as the mouse 
metallothionein-I promoter. For example, a plasmid construct could contain a HDAC10 
transcriptional control sequence fused to a reporter transcription unit that encodes the coding region 
of β-galactosidase, chloramphenicol acetyltransferase, green fluorescent protein or luciferase. This 
construct could be used to screen for small molecules that modulate HDAC10 transcription. Such 
molecules are potential therapeutics. Furthermore, using fluorescence microscopy or Biophotonic in 
vivo imaging, a technology that produces visual and quantitative measurements in real time (Xenogen, 
Palo Alto, CA), expression of a fluorescent HDAC10 reporter gene could be examined to determine 
the effects of an HDAC10 therapeutic in mammalian cells or xenografts. Changes in these reporters in 
normal, diseased or drug-treated tissue or cells would be indicators of changes in HDAC10 expression 
or activity. 

In an insect system, Autographa californica nuclear polyhedrosis virus (AcNPV) is one of 
several insect systems that can be used as a vector to express foreign genes. The virus grows in 
Spodoptera frugiperda cells. The coding sequence may be cloned individually into non-essential 
regions (for example the polyhedrin gene) of the virus and placed under control of an AcNPV 
promoter (for example the polyhedrin promoter). Successful insertion of the coding sequence will 
result in inactivation of the polyhedrin gene and production of non-occluded recombinant virus (i.e., 
virus lacking the proteinaceous coat coded for by the polyhedrin gene). These recombinant viruses are 
then used to infect Spodoptera frugiperda cells in which the inserted gene is expressed. 

In mammalian host cells, a number of viral-based expression systems may be utilized. In cases 
where an adenovirus is used as an expression vector, the coding sequence of interest may be ligated to 
an adenovirus transcription/translation control complex, e.g., the late promoter and tripartite leader 
sequence. This chimeric gene may then be inserted in the adenovirus genome by in vitro or in vivo 
recombination. Insertion in a non-essential region of the viral genome (e.g., region E1 or E3) will 
result in a recombinant virus that is viable and capable of expressing the desired protein in infected 
hosts (e.g., See Logan & Shenk, 1984, Proc. Natl. Acad. Sci. USA 81:3655-3659). Specific initiation 
signals may also be required for efficient translation of inserted gene coding sequences. These signals 
include the ATG initiation codon and adjacent sequences. In cases where an entire gene, including its 
own initiation codon and adjacent sequences, is inserted into the appropriate expression vector, no 
additional translational control signals may be needed. However, in cases where only a portion of the 
gene coding sequence is inserted, exogenous translational control signals, including, perhaps, the 
ATG initiation codon, must be provided. Furthermore, the initiation codon must be in phase with the 
reading frame of the desired coding sequence to ensure translation of the entire insert. These 
exogenous translational control signals and initiation codons can be of a variety of origins, both 
natural and synthetic. The efficiency of expression may be enhanced by the inclusion of appropriate 
transcription enhancer elements, transcription terminators, etc. Other common systems are based on 
SV40, retrovirus or adeno-associated virus. Selection of appropriate vectors and promoters for 
expression in a host cell is a well known procedure and the requisite techniques for expression vector 
construction, introduction of the vector into the host and expression in the host per se are routine 
skills in the art. Generally, recombinant expression vectors will include origins of replication, a 
promoter derived from a highly expressed gene to direct transcription of a downstream structural 
sequence, and a selectable marker to permit isolation of vector containing cells after exposure to the 
vector. 
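
The requirement that an exogenous initiation codon be in phase with the reading frame of the inserted coding sequence reduces to a check that the two positions are separated by a whole number of codons. The Python sketch below illustrates this; the function name in_frame, the 0-based index convention and the example construct are assumptions of the sketch and do not describe any particular vector.

```python
def in_frame(construct: str, atg_index: int, insert_start: int) -> bool:
    """Check that an ATG is in phase with a downstream coding insert.

    `atg_index` is the 0-based position of the A of the initiation codon
    and `insert_start` is the 0-based position of the first base of the
    first codon of the insert. The two are in phase when they are
    separated by a multiple of three bases.
    """
    if construct[atg_index:atg_index + 3].upper() != "ATG":
        raise ValueError("no ATG at the given index")
    if insert_start < atg_index:
        raise ValueError("insert must lie downstream of the initiation codon")
    return (insert_start - atg_index) % 3 == 0


if __name__ == "__main__":
    # Hypothetical construct: 6-base leader, ATG, 3-base linker, then insert.
    construct = "GCCACCATGGGATCCAAAGGG"
    print(in_frame(construct, atg_index=6, insert_start=12))
    print(in_frame(construct, atg_index=6, insert_start=13))
```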

In addition, a host cell strain may be chosen which modulates the expression of the inserted 
sequences, or modifies and processes the gene product in the specific fashion desired. Such 
modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may be 
important for the function of the protein. Different host cells have characteristic and specific 
mechanisms for the post-translational processing and modification of proteins. Appropriate cell lines 
or host systems can be chosen to ensure the correct modification and processing of the foreign protein 
expressed. To this end, eukaryotic host cells that possess the cellular machinery for proper processing 
of the primary transcript, glycosylation, and phosphorylation of the gene product may be used. Such 
mammalian host cells include but are not limited to CHO, VERO, BHK, HeLa, COS, MDCK, 293, 
3T3, WI38, etc. and are well known to one of skill in the art. 

For long-term, high-yield production of recombinant proteins, stable expression is preferred. 
For example, cell lines that stably express a differentially expressed protein product of a gene may be 
engineered. Rather than using expression vectors which contain viral origins of replication, host cells 
can be transformed with DNA controlled by appropriate expression control elements (e.g., promoter, 
enhancer sequences, transcription terminators, polyadenylation sites, etc.), and a selectable marker. 
Following the introduction of the foreign DNA, engineered cells may be allowed to grow for 1-2 days 
in an enriched medium, and then are switched to a selective medium. The selectable marker in the 
recombinant plasmid confers resistance to the selection and allows cells to stably integrate the 
plasmid into their chromosomes and grow to form foci which in turn can be cloned and expanded into 
cell lines. This method may advantageously be used to engineer cell lines that express the 
differentially expressed gene protein. Such engineered cell lines may be particularly useful in 
screening and evaluation of compounds that affect the endogenous activity of the expressed protein. 

A number of selection systems may be used, including, but not limited to, the herpes simplex 
virus thymidine kinase, hypoxanthine-guanine phosphoribosyltransferase and adenine 
phosphoribosyltransferase genes, which can be employed in tk⁻, hgprt⁻ or aprt⁻ cells, respectively. Also, 
antimetabolite resistance can be used as the basis of selection for dhfr, which confers resistance to 
methotrexate; gpt, which confers resistance to mycophenolic acid; neo, which confers resistance to the 
aminoglycoside G-418; and hygro, which confers resistance to hygromycin. 

An alternative fusion protein system allows for the ready purification of non-denatured fusion 
proteins expressed in human cell lines. In this system, the gene of interest is subcloned into a vaccinia 
recombination plasmid such that the gene's open reading frame is translationally fused to an amino- 
terminal tag consisting of six histidine residues. Extracts from cells infected with recombinant 
vaccinia virus are loaded onto Ni²⁺ nitrilotriacetic acid-agarose columns and histidine-tagged proteins 
are selectively eluted with imidazole-containing buffers. 

When used as a component in assay systems such as those described below, a protein of the 
present invention may be labeled, either directly or indirectly, to facilitate detection of a complex 
formed between the protein and a test substance. Any of a variety of suitable labeling systems may be 
used including, but not limited to, radioisotopes such as ¹²⁵I; enzyme labeling systems that generate a 
detectable colorimetric signal or light when exposed to substrate; and fluorescent labels. 

Where recombinant DNA technology is used to produce a protein of the present invention for 
such assay systems, it may be advantageous to engineer fusion proteins that can facilitate labeling, 
immobilization, detection and/or isolation. 

Indirect labeling involves the use of a protein, such as a labeled antibody, which specifically 
binds to a polypeptide of the present invention. Such antibodies include but are not limited to 
polyclonal, monoclonal, chimeric, single chain, Fab fragments and fragments produced by an Fab 
expression library. 

In another embodiment, nucleic acids comprising a sequence encoding HDAC10 protein or 
functional derivative thereof, may be administered to promote normal biological function, for 
example, normal transcriptional regulation, by way of gene therapy. Gene therapy refers to therapy 
performed by the administration of a nucleic acid to a subject. In this embodiment of the invention, 
the nucleic acid produces its encoded protein that mediates a therapeutic effect by promoting normal 
transcriptional regulation. 

Any of the methods for gene therapy available in the art can be used according to the present 
invention. Exemplary methods are described below. 

In a preferred aspect, the therapeutic comprises a HDAC10 nucleic acid that is part of an 
expression vector that expresses a HDAC10 protein or fragment or chimeric protein thereof in a 
suitable host. In particular, such a nucleic acid has a promoter operably linked to the HDAC10 coding 
region, said promoter being inducible or constitutive, and, optionally, tissue-specific. In another 
particular embodiment, a nucleic acid molecule is used in which the HDAC10 coding sequences and 
any other desired sequences are flanked by regions that promote homologous recombination at a 
desired site in the genome, thus providing for intrachromosomal expression of the HDAC10 nucleic 
acid. 

Delivery of the nucleic acid into a patient may be either direct, in which case the patient is 
directly exposed to the nucleic acid or nucleic acid-carrying vector, or indirect, in which case, cells 
are first transformed with the nucleic acid in vitro, then transplanted into the patient. These two 
approaches are known, respectively, as in vivo or ex vivo gene therapy. 

In a specific embodiment, the nucleic acid is directly administered in vivo, where it is 
expressed to produce the encoded product. This can be accomplished by any of numerous methods 
known in the art, for example, by constructing it as part of an appropriate nucleic acid expression 
vector and administering it so that it becomes intracellular, e.g., by infection using a defective or
attenuated retroviral or other viral vector, or by direct injection of naked DNA, or by use of
microparticle bombardment (e.g., a gene gun; Biolistic, Dupont), or coating with lipids or cell-surface 
receptors or transfecting agents, encapsulation in liposomes, microparticles, or microcapsules, or by 
administering it in linkage to a peptide which is known to enter the nucleus, by administering it in 
linkage to a ligand subject to receptor-mediated endocytosis (which can be used to target cell types 
specifically expressing the receptors), etc. In another embodiment, a nucleic acid-ligand complex can 
be formed in which the ligand comprises a fusogenic viral peptide to disrupt endosomes, allowing the 
nucleic acid to avoid lysosomal degradation. In yet another embodiment, the nucleic acid can be 
targeted in vivo for cell specific uptake and expression, by targeting a specific receptor. Alternatively, 
the nucleic acid can be introduced intracellularly and incorporated within host cell DNA for
expression, by homologous recombination.

In a specific embodiment, a viral vector that contains the HDAC10 nucleic acid is used. For 
example, a retroviral vector can be used. These retroviral vectors have been modified to delete 
retroviral sequences that are not necessary for packaging of the viral genome and integration into host 
cell DNA. The HDAC10 nucleic acid to be used in gene therapy is cloned into the vector, which 
facilitates delivery of the gene into a patient. 

Adenoviruses are other viral vectors that can be used in gene therapy. Adenoviruses are 
especially attractive vehicles for delivering genes to respiratory epithelia. Adenoviruses naturally 
infect respiratory epithelia where they cause a mild disease. Other targets for adenovirus-based 
delivery systems are liver, the central nervous system, endothelial cells, and muscle. Adenoviruses 
have the advantage of being capable of infecting non-dividing cells. Adeno-associated virus (AAV) 
has also been proposed for use in gene therapy. 

Another approach to gene therapy involves transferring a gene to cells in tissue culture by 
such methods as electroporation, lipofection, calcium phosphate mediated transfection, or viral 
infection. Usually, the method of transfer includes the transfer of a selectable marker to the cells. The 
cells are then placed under selection to isolate those cells that have taken up and are expressing the 
transferred gene. Those cells are then delivered to a patient. 

In this embodiment, the nucleic acid is introduced into a cell prior to administration in vivo of 
the resulting recombinant cell. Such introduction can be carried out by any method known in the art, 
including but not limited to transfection, electroporation, microinjection, infection with a viral or
bacteriophage vector containing the nucleic acid sequences, cell fusion, chromosome-mediated gene
transfer, microcell-mediated gene transfer, spheroplast fusion, etc. Numerous techniques are known in 
the art for the introduction of foreign genes into cells and may be used in accordance with the present 
invention, provided that the necessary developmental and physiological functions of the recipient 
cells are not disrupted. The technique should provide for the stable transfer of the nucleic acid to the 
cell, so that the nucleic acid is expressible by the cell and preferably heritable and expressible by its 
cell progeny. 

The resulting recombinant cells can be delivered to a patient by various methods known in the 
art. In a preferred embodiment, epithelial cells are injected, e.g., subcutaneously. In another 
embodiment, recombinant skin cells may be applied as a skin graft onto the patient. Recombinant 
blood cells (e.g., hematopoietic stem or progenitor cells) are preferably administered intravenously. 
The amount of cells envisioned for use depends on the desired effect, patient state, etc., and can be 
determined by one skilled in the art. 

Cells into which a nucleic acid can be introduced for purposes of gene therapy encompass any 
desired, available cell type, and include but are not limited to epithelial cells, endothelial cells, 
keratinocytes, fibroblasts, muscle cells, hepatocytes; blood cells such as T lymphocytes, B 
lymphocytes, monocytes, macrophages, neutrophils, eosinophils, megakaryocytes, granulocytes; 
various stem or progenitor cells, in particular hematopoietic stem or progenitor cells, e.g., as obtained 
from bone marrow, umbilical cord blood, peripheral blood, fetal liver, etc. 

In a preferred embodiment, the cell used for gene therapy is autologous to the patient. 

In an embodiment in which recombinant cells are used in gene therapy, a HDAC10 nucleic 
acid is introduced into the cells such that it is expressible by the cells or their progeny, and the 
recombinant cells are then administered in vivo for therapeutic effect. In a specific embodiment, stem 
or progenitor cells are used. Any stem and/or progenitor cells that can be isolated and maintained in
vitro can potentially be used in accordance with this embodiment of the present invention. Such stem 
cells include but are not limited to hematopoietic stem cells (HSC), stem cells of epithelial tissues 
such as the skin and the lining of the gut, embryonic heart muscle cells, liver stem cells, and neural 
stem cells.

Epithelial stem cells (ESCs) or keratinocytes can be obtained from tissues such as the skin
and the lining of the gut by known procedures. In stratified epithelial tissue such as the skin, renewal 
occurs by mitosis of stem cells within the germinal layer, the layer closest to the basal lamina. Stem 
cells within the lining of the gut provide for a rapid renewal rate of this tissue. ESCs or keratinocytes 
obtained from the skin or lining of the gut of a patient or donor can be grown in tissue culture. If the 
ESCs are provided by a donor, a method for suppression of host versus graft reactivity (e.g., 
irradiation, drug or antibody administration to promote moderate immunosuppression) can also be 
used. 

With respect to hematopoietic stem cells (HSC), any technique which provides for the 
isolation, propagation, and maintenance in vitro of HSC can be used in this embodiment of the 
invention. Techniques by which this may be accomplished include (a) the isolation and establishment 
of HSC cultures from bone marrow cells isolated from the future host, or a donor, or (b) the use of 
previously established long-term HSC cultures, which may be allogeneic or xenogeneic. Non- 
autologous HSC are used preferably in conjunction with a method of suppressing transplantation 
immune reactions of the future host/patient. In a particular embodiment of the present invention, 
human bone marrow cells can be obtained from the posterior iliac crest by needle aspiration. In a 
preferred embodiment of the present invention, the HSCs can be obtained in highly enriched or
substantially pure form. This enrichment can be accomplished before, during, or after long-term
culturing, and can be done by any techniques known in the art. Long-term cultures of bone marrow 
cells can be established and maintained by using, for example, modified Dexter cell culture 
techniques or Whitlock-Witte culture techniques.

In a specific embodiment, the nucleic acid to be introduced for purposes of gene therapy 
comprises an inducible promoter operably linked to the coding region, such that expression of the 
nucleic acid is controllable by controlling the presence or absence of the appropriate inducer of 
transcription. 

A further embodiment of the present invention relates to a purified antibody or a fragment 
thereof which specifically binds to a polypeptide that comprises the amino acid sequence set forth in 
SEQ ID NO: 1 or to a fragment of said polypeptide. A preferred embodiment relates to a fragment of 
such an antibody, which fragment is an Fab or F(ab')2 fragment. In particular, the antibody can be a
polyclonal antibody or a monoclonal antibody. 

WO 03/014340 PCT/EP02/08654

Methods for the production of antibodies capable of specifically recognizing one or more 
differentially expressed gene epitopes are known to one of ordinary skill in the art. Such antibodies 
may include, but are not limited to polyclonal antibodies, monoclonal antibodies (mAbs), humanized 
or chimeric antibodies, single chain antibodies, Fab fragments, F(ab')2 fragments, fragments produced 
by a Fab expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any 
of the above. Such antibodies may be used, for example, in the detection of a fingerprint or target gene
in a biological sample, or, alternatively, as a method for the inhibition of abnormal target gene 
activity. Thus, such antibodies may be utilized as part of disease treatment methods, and/or may be 
used as part of diagnostic techniques whereby patients may be tested for abnormal levels of the 
HDAC10 polypeptide, or for the presence of abnormal forms of the HDAC10 polypeptide. 

For the production of antibodies to the HDAC10 polypeptide, various host animals may be
immunized by injection with the HDAC10 polypeptide, or a portion thereof. Such host animals may 
include but are not limited to rabbits, mice, and rats, to name but a few. Various adjuvants may be 
used to increase the immunological response, depending on the host species, including but not limited 
to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active 
substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet 
hemocyanin, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette- 
Guerin) and Corynebacterium parvum. 

Polyclonal antibodies are heterogeneous populations of antibody molecules derived from the 
sera of animals immunized with an antigen, such as target gene product, or an antigenic functional 
derivative thereof. For the production of polyclonal antibodies, host animals such as those described 
above, may be immunized by injection with the HDAC10 polypeptide, or a portion thereof, 
supplemented with adjuvants as also described above. 

Monoclonal antibodies, which are homogeneous populations of antibodies to a particular 
antigen, may be obtained by any technique that provides for the production of antibody molecules by 
continuous cell lines in culture. These include, but are not limited to the hybridoma technique, the 
human B-cell hybridoma technique, and the EBV-hybridoma technique. Such antibodies may be of 
any immunoglobulin class including IgG, IgM, IgE, IgA, IgD and any subclass thereof. The 
hybridoma producing the mAb of this invention may be cultivated in vitro or in vivo. Production of 
high titers of mAbs in vivo makes this the presently preferred method of production.

In addition, techniques developed for the production of "chimeric antibodies" by splicing the
genes from a mouse antibody molecule of appropriate antigen specificity together with genes from a 
human antibody molecule of appropriate biological activity can be used. A chimeric antibody is a 
molecule in which different portions are derived from different animal species, such as those having a 
variable or hypervariable region derived from a murine mAb and a human immunoglobulin constant 
region. 

Alternatively, techniques described for the production of single chain antibodies can be 
adapted to produce differentially expressed gene-single chain antibodies. Single chain antibodies are 
formed by linking the heavy and light chain fragments of the Fv region via an amino acid bridge, 
resulting in a single chain polypeptide. 

Most preferably, techniques useful for the production of "humanized antibodies" can be 
adapted to produce antibodies to the polypeptides, fragments, derivatives, and functional equivalents 
disclosed herein. 

Antibody fragments that recognize specific epitopes may be generated by known techniques. 
For example, such fragments include but are not limited to: the F(ab')2 fragments which can be
produced by pepsin digestion of the antibody molecule and the Fab fragments which can be generated
by reducing the disulfide bridges of the F(ab')2 fragments. Alternatively, Fab expression libraries may
be constructed (Huse et al., 1989, Science, 246: 1275-1281) to allow rapid and easy identification of 
monoclonal Fab fragments with the desired specificity. 

An antibody of the present invention can be preferably used in a method for the diagnosis of a 
condition associated with abnormal HDAC10 expression or activity, for example, abnormal cell 
proliferation, cancer, atherosclerosis, inflammatory bowel disease, host inflammatory or immune 
response, or psoriasis, in a human which comprises: measuring the amount of a polypeptide 
comprising the amino acid sequence set forth in SEQ ID NO:1, or fragments thereof, in an appropriate
tissue or cell from a human suffering from a condition associated with abnormal HDAC10 activity, 
wherein the presence of an elevated amount of said polypeptide or fragments thereof, relative to the 
amount of said polypeptide or fragments thereof in the respective tissue from a human not suffering 
from a condition associated with abnormal HDAC10 activity is diagnostic of said human's suffering 
from such condition. Such a method forms a further embodiment of the present invention. Preferably, 
said detecting step comprises contacting said appropriate tissue or cell with an antibody which
specifically binds to a polypeptide that comprises the amino acid sequence set forth in SEQ ID NO: 1
or a fragment thereof and detecting specific binding of said antibody with a polypeptide in said
appropriate tissue or cell, wherein detection of specific binding to a polypeptide indicates the
presence of a polypeptide that comprises the amino acid sequence set forth in SEQ ID NO:1 or a
fragment thereof. 

Particularly preferred, for ease of detection, is the sandwich assay, of which a number of 
variations exist, all of which are intended to be encompassed by the present invention. 

For example, in a typical forward assay, unlabeled antibody is immobilized on a solid 
substrate and the sample to be tested brought into contact with the bound molecule. After a suitable 
period of incubation time sufficient to allow formation of an antibody-antigen binary complex, a 
second antibody, labeled with a reporter molecule capable of inducing a detectable signal, is then 
added and incubated, allowing time sufficient for the formation of a ternary complex of antibody- 
antigen-labeled antibody. Any unreacted material is washed away, and the presence of the antigen is 
determined by observation of a signal, or may be quantitated by comparing with a control sample 
containing known amounts of antigen. Variations on the forward assay include the simultaneous 
assay, in which both sample and antibody are added simultaneously to the bound antibody, or a 
reverse assay in which the labeled antibody and sample to be tested are first combined, incubated and 
added to the unlabeled surface bound antibody. These techniques are well known to those skilled in 
the art, and the possibility of minor variations will be readily apparent. As used herein, "sandwich 
assay" is intended to encompass all variations on the basic two-site technique. For the immunoassays 
of the present invention, the only limiting factor is that the labeled antibody be an antibody that is 
specific for the HDAC10 polypeptide or a fragment thereof.

The most commonly used reporter molecules in this type of assay are either enzymes, 
fluorophore- or radionuclide-containing molecules. In the case of an enzyme immunoassay an enzyme 
is conjugated to the second antibody, usually by means of glutaraldehyde or periodate. As will be 
readily recognized, however, a wide variety of different ligation techniques exist, which are well 
known to the skilled artisan. Commonly used enzymes include horseradish peroxidase, glucose 
oxidase, beta-galactosidase and alkaline phosphatase, among others. The substrates to be used with 
the specific enzymes are generally chosen for the production, upon hydrolysis by the corresponding 
enzyme, of a detectable color change. For example, p-nitrophenyl phosphate is suitable for use with 
alkaline phosphatase conjugates; for peroxidase conjugates, 1,2-phenylenediamine or toluidine are
commonly used. It is also possible to employ fluorogenic substrates, which yield a fluorescent product
rather than the chromogenic substrates noted above. A solution containing the appropriate substrate is
then added to the ternary complex. The substrate reacts with the enzyme linked to the second
antibody, giving a qualitative visual signal, which may be further quantitated, usually
spectrophotometrically, to give an evaluation of the amount of HDAC10 which is present in the serum
sample. 
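
The quantitation step described above, in which a signal is compared with control samples containing known amounts of antigen, can be illustrated with a minimal sketch. It assumes that absorbance rises monotonically with antigen concentration and that linear interpolation between neighboring standards is adequate; the function name, the units and the standard values are hypothetical and are not taken from this specification.

```python
from bisect import bisect_left


def estimate_concentration(standards, absorbance):
    """Estimate antigen concentration by linear interpolation on a standard curve.

    standards: list of (known_concentration, measured_absorbance) pairs obtained
    from control samples. Absorbance is assumed to rise with concentration.
    """
    points = sorted(standards, key=lambda p: p[1])
    readings = [a for _, a in points]
    if not readings[0] <= absorbance <= readings[-1]:
        raise ValueError("absorbance lies outside the range of the standards")
    i = bisect_left(readings, absorbance)
    if readings[i] == absorbance:
        # The reading coincides with one of the standards.
        return points[i][0]
    (c0, a0), (c1, a1) = points[i - 1], points[i]
    # Interpolate between the two standards that bracket the reading.
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


# Hypothetical standards: (concentration in ng/mL, absorbance)
standards = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]
sample_estimate = estimate_concentration(standards, 0.35)
```

A reading outside the range of the standards raises an error rather than being extrapolated, since the specification only describes comparison with known amounts of antigen.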

Alternatively, fluorescent compounds, such as fluorescein and rhodamine, may be chemically
coupled to antibodies without altering their binding capacity. When activated by illumination with 
light of a particular wavelength, the fluorochrome-labeled antibody absorbs the light energy, inducing 
a state of excitability in the molecule, followed by emission of the light at a characteristic longer 
wavelength. The emission appears as a characteristic color visually detectable with a light 
microscope. Immunofluorescence and EIA techniques are both very well established in the art and 
are particularly preferred for the present method. However, other reporter molecules, such as 
radioisotopes, chemiluminescent or bioluminescent molecules may also be employed. It will be
readily apparent to the skilled artisan how to vary the procedure to suit the required use. 

This invention also relates to the use of polynucleotides of the present invention as diagnostic 
reagents. In particular, the invention relates to a method for the diagnosis of a condition associated 
with abnormal HDAC10 expression or activity, for example, abnormal cell proliferation, cancer, 
atherosclerosis, inflammatory bowel disease, host inflammatory or immune response, or psoriasis in a 
human which comprises: detecting elevated transcription of messenger RNA transcribed from the 
natural endogenous human gene encoding the polypeptide consisting of an amino acid sequence set
forth in SEQ ID NO: 1 in an appropriate tissue or cell from a human, wherein said elevated
transcription is diagnostic of said human's suffering from the condition associated with abnormal
HDAC10 expression or activity. In particular, said natural endogenous human gene comprises the
nucleotide sequence set forth in SEQ ID NO:4. In a preferred embodiment such a method comprises 
contacting a sample of said appropriate tissue or cell or contacting an isolated RNA or DNA molecule 
derived from that tissue or cell with an isolated nucleotide sequence of at least about 20 nucleotides in 
length that hybridizes under high stringency conditions with the isolated nucleotide sequence 
encoding a polypeptide consisting of an amino acid sequence set forth in SEQ ID NO:1.

Detection of a mutated form of the gene characterized by the polynucleotide of SEQ ID NO:4 
which is associated with a dysfunction will provide a diagnostic tool that can add to, or define, a
diagnosis of a disease, or susceptibility to a disease, which results from under-expression, over-
expression or altered spatial or temporal expression of the gene. Individuals carrying mutations in the 
gene may be detected at the DNA level by a variety of techniques. 

Nucleic acids, in particular mRNA, for diagnosis may be obtained from a subject's cells, such 
as from blood, urine, saliva, tissue biopsy or autopsy material. The genomic DNA may be used 
directly for detection or may be amplified enzymatically by using PCR or other amplification 
techniques prior to analysis. RNA or cDNA may also be used in similar fashion. Deletions and 
insertions can be detected by a change in size of the amplified product in comparison to the normal 
genotype. Point mutations can be identified by hybridizing amplified DNA to labeled nucleotide 
sequences which encode the HDAC10 polypeptide of the present invention. Perfectly matched 
sequences can be distinguished from mismatched duplexes by RNase digestion or by differences in 
melting temperatures. DNA sequence differences may also be detected by alterations in 
electrophoretic mobility of DNA fragments in gels, with or without denaturing agents, or by direct 
DNA sequencing. Sequence changes at specific locations may also be revealed by nuclease protection 
assays, such as RNase and S1 protection or the chemical cleavage method. In another embodiment, an
array of oligonucleotide probes comprising nucleotide sequence encoding the HDAC10 polypeptide
of the present invention or fragments of such a nucleotide sequence can be constructed to conduct 
efficient screening of e.g., genetic mutations. Array technology methods are well known and have 
general applicability and can be used to address a variety of questions in molecular genetics including 
gene expression, genetic linkage, and genetic variability. 
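
The size comparison described above for detecting deletions and insertions can be sketched as follows. This is only an illustration: the function name, the tolerance parameter (standing in for the sizing resolution of the gel or instrument) and the example sizes are assumptions, not values from this specification.

```python
def classify_amplicon(observed_bp, reference_bp, tolerance_bp=2):
    """Compare an amplified product's size with that of the normal genotype.

    tolerance_bp is a hypothetical allowance for sizing error.
    """
    difference = observed_bp - reference_bp
    if abs(difference) <= tolerance_bp:
        return "no size change detected"
    kind = "insertion" if difference > 0 else "deletion"
    return f"possible {kind} of about {abs(difference)} bp"


# Hypothetical normal product of 500 bp
print(classify_amplicon(512, 500))
print(classify_amplicon(470, 500))
```

A point mutation leaves the product size unchanged, which is why the text turns to hybridization, melting temperature, electrophoretic mobility or sequencing for such changes.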

The diagnostic assays offer a process for diagnosing or determining a susceptibility to disease 
through detection of mutation in the HDAC10 gene by the methods described. In addition, such 
diseases may be diagnosed by methods comprising determining from a sample derived from a subject 
an abnormally decreased or increased level of polypeptide or mRNA. Decreased or increased 
expression can be measured at the RNA level using any of the methods well known in the art for the 
quantitation of polynucleotides, such as, for example, nucleic acid amplification, for instance PCR, 
RT-PCR, RNase protection, Northern blotting and other hybridization methods. Assay techniques that 
can be used to determine levels of a protein, such as a polypeptide of the present invention, in a 
sample derived from a host are well-known to those of skill in the art. Such assay methods include 
radioimmunoassays, competitive-binding assays, Western Blot analysis and ELISA assays. 
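
As one illustration of measuring increased or decreased expression at the RNA level by RT-PCR, the sketch below uses the comparative threshold-cycle (2^-ddCt) calculation. This calculation is a common convention that is not described in this specification; it assumes quantitative RT-PCR with near-100% amplification efficiency and a stably expressed reference gene, and all names and Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript in a sample relative to a control.

    Each Ct is a threshold cycle; the reference gene normalizes for input amount.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses threshold three cycles earlier
# in the sample than in the control after normalization.
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
```

A fold change above 1 would indicate increased transcript levels relative to the control tissue, and a value below 1 decreased levels.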

Thus in another aspect, the present invention relates to a diagnostic kit which comprises:

(a) a polynucleotide of the present invention, preferably the nucleotide sequence of SEQ ID
NO:2, 3 or 4, or a fragment thereof; 

(b) a nucleotide sequence complementary to that of (a); 

(c) a polypeptide of the present invention, preferably the polypeptide of SEQ ID NO:1 or a
fragment thereof; or 

(d) an antibody to a polypeptide of the present invention, preferably to the polypeptide of 
SEQ ID NO: 1. 

It will be appreciated that in any such kit, (a), (b), (c) or (d) may comprise a substantial 
component. Such a kit will be of use in diagnosing a disease or susceptibility to a disease, particularly 
to a disease or condition associated with abnormal HDAC10 expression or activity, for example, 
abnormal cell proliferation, cancer, atherosclerosis, inflammatory bowel disease, host inflammatory or 
immune response, or psoriasis. 

The nucleotide sequences of the present invention are also valuable for chromosome 
localization. The sequence is specifically targeted to, and can hybridize with, a particular location on 
an individual human chromosome. The mapping of relevant sequences to chromosomes according to 
the present invention is an important first step in correlating those sequences with gene-associated 
disease. Once a sequence has been mapped to a precise chromosomal location, the physical position 
of the sequence on the chromosome can be correlated with genetic map data. The relationship 
between genes and diseases that have been mapped to the same chromosomal region are then 
identified through linkage analysis (coinheritance of physically adjacent genes). 

The differences in the cDNA or genomic sequence between affected and unaffected 
individuals can also be determined. If a mutation is observed in some or all of the affected individuals 
but not in any normal individuals, then the mutation is likely to be the causative agent of the disease. 

An additional embodiment of the invention relates to the administration of a pharmaceutical 
composition, in conjunction with a pharmaceutically acceptable carrier, excipient or diluent, for any 
of the therapeutic effects discussed above. Such pharmaceutical compositions may consist of 
HDAC10, antibodies to that polypeptide, mimetics, agonists, antagonists, or inhibitors of HDAC10 
function. The compositions may be administered alone or in combination with at least one other 
agent, such as stabilizing compound, which may be administered in any sterile, biocompatible 
pharmaceutical carrier, including, but not limited to, saline, buffered saline, dextrose, and water. The
compositions may be administered to a patient alone, or in combination with other agents, drugs or
hormones. 

In addition, any of the therapeutic proteins, antagonists, antibodies, agonists, antisense 
sequences or vectors described above may be administered in combination with other appropriate 
therapeutic agents. Selection of the appropriate agents for use in combination therapy may be made 
by one of ordinary skill in the art, according to conventional pharmaceutical principles. The 
combination of therapeutic agents may act synergistically to effect the treatment or prevention of the 
various disorders described above. Using this approach, one may be able to achieve therapeutic 
efficacy with lower dosages of each agent, thus reducing the potential for adverse side effects. 
Antagonists and agonists of HDAC10 may be made using methods that are generally known in the art. 

The pharmaceutical compositions encompassed by the invention may be administered by any 
number of routes including, but not limited to, oral, intravenous, intramuscular, intra-articular, intra- 
arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, 
intranasal, enteral, topical, sublingual, or rectal means. 

In addition to the active ingredients, these pharmaceutical compositions may contain suitable 
pharmaceutically acceptable carriers comprising excipients and auxiliaries which facilitate processing 
of the active compounds into preparations which can be used pharmaceutically. Further details on 
techniques for formulation and administration may be found in the latest edition of Remington's 
Pharmaceutical Sciences (Mack Publishing Co., Easton, Pa.).

Pharmaceutical compositions for oral administration can be formulated using 
pharmaceutically acceptable carriers well known in the art in dosages suitable for oral administration. 
Such carriers enable the pharmaceutical compositions to be formulated as tablets, pills, dragees, 
capsules, liquids, gels, syrups, slurries, suspensions, and the like, for ingestion by the patient. 

Pharmaceutical preparations for oral use can be obtained through combination of active 
compounds with solid excipient, optionally grinding a resulting mixture, and processing the mixture 
of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable 
excipients are carbohydrate or protein fillers, such as sugars, including lactose, sucrose, mannitol, or 
sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose, such as methyl cellulose, 
hydroxypropylmethyl-cellulose, or sodium carboxymethylcellulose; gums including arabic and
tragacanth; and proteins such as gelatin and collagen. If desired, disintegrating or solubilizing agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, alginic acid, or a salt thereof, such 
as sodium alginate. 

Dragee cores may be used in conjunction with suitable coatings, such as concentrated sugar 
solutions, which may also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel, polyethylene 
glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. 
Dyestuffs or pigments may be added to the tablets or dragee coatings for product identification or to 
characterize the quantity of active compound, i.e., dosage. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a coating, such as glycerol or sorbitol. 
Push-fit capsules can contain active ingredients mixed with a filler or binders, such as lactose or 
starches, lubricants, such as talc or magnesium stearate, and, optionally, stabilizers. In soft capsules, 
the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils, liquid, or 
liquid polyethylene glycol with or without stabilizers. 

Pharmaceutical formulations suitable for parenteral administration may be formulated in
aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution, Ringer's 
solution, or physiologically buffered saline. Aqueous injection suspensions may contain substances 
which increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Additionally, suspensions of the active compounds may be prepared as appropriate oily 
injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or 
synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Non-lipid polycationic 
amino polymers may also be used for delivery. Optionally, the suspension may also contain suitable 
stabilizers or agents which increase the solubility of the compounds to allow for the preparation of 
highly concentrated solutions. 

For topical or nasal administration, penetrants appropriate to the particular barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art. 

The pharmaceutical compositions of the present invention may be manufactured in a manner 
that is known in the art, e.g., by means of conventional mixing, dissolving, granulating, dragee- 
making, levigating, emulsifying, encapsulating, entrapping, or lyophilizing processes.

The pharmaceutical composition may be provided as a salt and can be formed with many
acids, including but not limited to, hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic, etc. 
Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free 
base forms. In other cases, the preferred preparation may be a lyophilized powder which may contain 
any or all of the following: 1-50 mM histidine, 0.1%-2% sucrose, and 2-7% mannitol, at a pH range
of 4.5 to 5.5, that is combined with buffer prior to use. 

After pharmaceutical compositions have been prepared, they can be placed in an appropriate 
container and labeled for treatment of an indicated condition. For administration of the HDAC10, 
such labeling would include amount, frequency, and method of administration. 

Pharmaceutical compositions suitable for use in the invention include compositions wherein 
the active ingredients are contained in an effective amount to achieve the intended purpose. The 
determination of an effective dose is well within the capability of those skilled in the art. 

For any compound, the therapeutically effective dose can be estimated initially either in cell 
culture assays, e.g., of neoplastic cells, or in animal models, usually mice, rabbits, dogs, or pigs. The 
animal model may also be used to determine the appropriate concentration range and route of 
administration. Such information can then be used to determine useful doses and routes for 
administration in humans. 

A therapeutically effective dose refers to that amount of active ingredient, for example 
HDAC10 or fragments thereof, antibodies of HDAC10, agonists, antagonists or inhibitors of 
HDAC10, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be 
determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., 
ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% 
of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index, and it 
can be expressed as the ratio, LD50/ED50. Pharmaceutical compositions which exhibit large 
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies is 
used in formulating a range of dosage for human use. The dosage contained in such compositions is 
preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. 
The dosage varies within this range depending upon the dosage form employed, sensitivity of the 
patient, and the route of administration. 
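
The therapeutic index described above is a simple ratio, and the calculation may be sketched as follows. The function name and the dose values are hypothetical and serve only to illustrate the arithmetic; they are not measured data for HDAC10 or for any compound.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses")
    return ld50 / ed50


# Hypothetical doses in mg/kg, chosen only to illustrate the ratio.
example_index = therapeutic_index(ld50=500.0, ed50=25.0)
print(f"therapeutic index = {example_index:.1f}")
```

A larger ratio corresponds to a wider margin between the effective dose and the lethal dose, which is why compositions with large therapeutic indices are preferred.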

The exact dosage will be determined by the practitioner, in light of factors related to the 
subject that requires treatment. Dosage and administration are adjusted to provide sufficient levels of 
the active moiety or to maintain the desired effect. Factors which may be taken into account include 
the severity of the disease state, general health of the subject, age, weight, and gender of the subject, 
diet, time and frequency of administration, drug combination(s), reaction sensitivities, and 
tolerance/response to therapy. Long-acting pharmaceutical compositions may be administered every 3 
to 4 days, every week, or once every two weeks depending on half-life and clearance rate of the 
particular formulation. 

Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of about 
1 g, depending upon the route of administration. Guidance as to particular dosages and methods of 
delivery is provided in the literature and generally available to practitioners in the art. Those skilled in 
the art will employ different formulations for nucleotides than for proteins or their inhibitors. 
Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, conditions, 
locations, etc. Pharmaceutical formulations suitable for oral administration of proteins are known in 
the art. 

All patent applications, patents and literature references cited herein are hereby incorporated 
by reference in their entirety. 

The following Examples illustrate the present invention, without in any way limiting the 
scope thereof. 

Example 1: HDAC10 protein expression in vivo 

An expression vector containing HDAC10's coding sequences plus the Flag-epitope encoding 
sequences at the C-terminus is transfected into 293 embryonic kidney cells using the GenePORTER2 
transfection reagent (Gene Therapy Systems, Inc., San Diego, CA). Forty-eight hr after transfection, 
cell lysates are prepared from the transfected cells and 10 µg of total protein is subjected to 
SDS-PAGE on a 10% Tris-glycine gel. The proteins are then transferred onto a PVDF membrane and 
probed with an anti-Flag antibody, followed by a secondary antibody that is conjugated with 
horseradish peroxidase, which allows for detection of signal using enhanced luminescence reagents. 
The anti-Flag antibody detects the HDAC10-Flag fusion protein as a single band of 39 kDa in size, 
which agrees with the estimated size of HDAC10 protein based on its amino acid composition. 
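
The size estimate from amino acid composition can be reproduced with a short calculation. The sketch below sums standard average residue masses over SEQ ID NO:1 and adds one water; the sequence string is a transcription of the listing (assumed to be 347 residues), and the function and variable names are illustrative only. The Flag epitope and any linker residues are not included.

```python
# Average residue masses in daltons (standard values for the 20 amino acids).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015  # one water per chain for the free N- and C-termini

# SEQ ID NO:1 as transcribed from the sequence listing (assumption: 347 residues).
HDAC10 = (
    "MLHTTQLYQHVPETPWPIVYSPRYNITFMGLEKLHPFDAGKWGKVINFLKEEKLLSDSML"
    "VEAREASEEDLLVVHTRRYLNELKWSFAVATITEIPPVIFLPNFLVQRKVLRPLRTQTGG"
    "TIMAGKLAVERGWAINVGGGFHHCSSDRGGGFCAYADITLAIKFLFERVEGISRATIIDL"
    "DAHQGNGHERDFMDDKRVYIMDVYNRHIYPGDRFAKQAIRRKVELEWGTEDDEYLDKVER"
    "NIKKSLQEHLPDVVVYNAGTDILEGDRLGGLSISPAGIVKRDELVFRMVRGRRVPILMVT"
    "SGGYQKRTARIIADSILNLFGLGLIGPESPSVSAQNSDTPLLPPAVP"
)


def molecular_weight(sequence: str) -> float:
    """Average molecular weight in daltons of an unmodified polypeptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


print(f"{len(HDAC10)} residues, {molecular_weight(HDAC10) / 1000:.1f} kDa")
```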

Example 2: Distribution of HDAC10 mRNA in normal human tissues and cancer cell lines 

A multiple human tissue Northern blot is purchased from Clontech (Palo Alto, CA). A 32P-labeled 
probe corresponding to HDAC10 cDNA (nucleotide no. 181 to no. 1122) is prepared using the 
Rediprime DNA labeling system (Amersham Pharmacia Biotech) according to the manufacturer's 
instructions. The Northern blot is pre-hybridized and hybridized in the presence of the 32P-labeled 
probe under stringent conditions according to the manufacturer's protocol. A probe corresponding to 
human actin cDNA (Clontech) is used as a control for the relative amount of mRNA in each lane. 
Results of Northern analyses indicate that there are two splice variant forms of HDAC10 mRNA: 
one is ~1.7 kb, which agrees with the size of the full-length cDNA (SEQ ID NO:2); the other is ~3.2 kb 
and is expressed at a higher level. The larger transcript agrees with the size of a Macaca fascicularis 
brain cDNA clone (GenBank™ accession #AB052134), which encodes a truncated HDAC10 
polypeptide (minus the first 29 amino acids) with 3 conservative amino acid substitutions. Northern 
analyses also show that the overall expression level of HDAC10 mRNA is low and that high expression 
is restricted to brain, heart, skeletal muscle and kidney. These findings imply that the HDAC10 gene 
is expressed in normal human tissues and that HDAC10's function may be tissue-specific. 

In addition to Northern blotting, real-time PCR is used to examine HDAC10 
mRNA distribution in normal human tissues as well as several human cancer cell lines. These 
experiments confirm the findings of the Northern analyses; in addition, they reveal a high expression level 
of HDAC10 in testis. Furthermore, our data indicate that a large amount of HDAC10 mRNA is also 
found in a non-small cell lung carcinoma cell line, a rhabdomyosarcoma muscle tumor line, a urinary 
bladder cancer cell line and an osteosarcoma cell line. Taken together, these results indicate that 
HDAC10 may function not only in normal human tissues, but also in the development and/or 
maintenance of human cancers. 

Example 3: In vitro HDAC enzyme assay 

To determine whether the putative HDAC10 is an active deacetylase, transfected Flag 
epitope-tagged recombinant HDAC10 is used to measure the ability of HDAC10 to deacetylate 
histone H4 peptide. Enzymatic activity may be determined according to conventional methods, such 
as the following techniques: 

Preparation of HDAC10-Flag expression vector. Using conventional techniques in molecular 
biology, a Flag-epitope sequence is added to the C-terminus of HDAC10 coding sequences (SEQ ID 
NO:3) by PCR. The PCR primers are: 

Forward: 5'-GAGGATCCACCATGCTACACACAACCCAGCTG-3' 

Reverse: 5'-GCGTCTAGACTACTTGTCATCGTCGTCCTTGTAATCAGCCCGGGGCACTGCAGGGGGAAG-3' 

The BamHI (GGATCC) and XbaI (TCTAGA) restriction enzyme cutting sites are underlined, the ATG translational start site 
is bolded in the forward primer and the Flag-epitope encoding sequences are bolded in the reverse 
primer. The Flag-tagged HDAC10 PCR fragment is cloned into the pcDNA3.1(+) expression vector 
between the BamHI and XbaI sites. 
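
The primer design can be checked in silico. The following sketch assumes the primer sequences read as written in the code (a BamHI site and ATG in the forward primer; an XbaI site, a stop codon and the Flag-epitope coding sequence in the reverse primer); the helper names are hypothetical. It reverse-complements the reverse primer and translates both primers in the HDAC10 reading frame.

```python
# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Primer sequences as assumed for this illustration.
FORWARD = "GAGGATCCACCATGCTACACACAACCCAGCTG"
REVERSE = "GCGTCTAGACTACTTGTCATCGTCGTCCTTGTAATCAGCCCGGGGCACTGCAGGGGGAAG"

start = FORWARD.index("ATG")          # translational start in the forward primer
sense = reverse_complement(REVERSE)   # coding-strand view of the reverse primer

print("forward primer encodes:", translate(FORWARD[start:]))
print("reverse primer encodes:", translate(sense))
```

Under these assumptions the coding-strand view of the reverse primer reads through the last HDAC10 codons, a two-residue linker and the Flag epitope (DYKDDDDK) before reaching a stop codon and the XbaI site.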

Transfection and Immunoprecipitation. Approximately 1 x 10⁷ 293 human embryonic kidney 
cells are grown in a 15-cm² plate (~50% confluent) on the day of transfection. GenePORTER 
transfection reagent (Gene Therapy Systems, Inc., San Diego, CA) is used to transfect 30 µg of 
plasmid DNA per plate of cells according to the manufacturer's instructions. Forty-eight hr after 
transfection, cells are washed twice with ice-cold phosphate-buffered saline (PBS) and resuspended in 
1 mL ice-cold lysis buffer (50 mM Tris-Cl, pH 7.4, 120 mM NaCl, 0.5 mM EDTA, 0.5% NP-40) 
supplemented with EDTA-free protease inhibitor complete (Roche Molecular Biochemicals, 
Indianapolis, IN). The lysate is incubated at 4°C for 20 min on a rotator, followed by spinning at 
12,000 x g for 20 min at 4°C. The soluble supernatant is collected and used for immunoprecipitation 
with 20 µl anti-FLAG M2 affinity gel (Sigma, Saint Louis, MO) at 4°C overnight. As a negative 
control, 1 mL lysis buffer is used instead of the cell lysate. The immunoprecipitated complex is 
pelleted by centrifugation and washed three times with 1 mL ice-cold lysis buffer, four times with 
lysis buffer containing 1 M NaCl and three times with 1 mL HDAC assay buffer (10 mM Tris-Cl, pH 
8.0, 10 mM NaCl, 10% glycerol). 
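
The lysis buffer can be made up from concentrated stocks using the dilution relation C1 x V1 = C2 x V2. The sketch below is illustrative only: the stock concentrations are hypothetical, the NP-40 concentration is taken as 0.5%, and the function names are not part of any library.

```python
def stock_volume_ml(final_conc: float, stock_conc: float, final_volume_ml: float) -> float:
    """Volume of stock needed, from C1*V1 = C2*V2 (concentrations in the same units)."""
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the final buffer")
    return final_conc / stock_conc * final_volume_ml


# component: (final concentration, hypothetical stock concentration), same units per row
LYSIS_BUFFER = {
    "Tris-Cl pH 7.4 (mM)": (50.0, 1000.0),
    "NaCl (mM)": (120.0, 5000.0),
    "EDTA (mM)": (0.5, 500.0),
    "NP-40 (%)": (0.5, 10.0),   # assumes the final concentration is 0.5%
}


def recipe(components: dict, final_volume_ml: float) -> dict:
    """Stock volumes per component plus the water needed to reach the final volume."""
    volumes = {
        name: stock_volume_ml(final, stock, final_volume_ml)
        for name, (final, stock) in components.items()
    }
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


for name, volume in recipe(LYSIS_BUFFER, 100.0).items():
    print(f"{name:22s} {volume:7.2f} mL")
```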

In vitro HDAC enzyme assay. The immunoprecipitated complex is suspended in 30 µl HDAC 
assay buffer containing 30,000 cpm of the acetylated histone H4 peptide. Histone deacetylase activity 
is determined after incubation at 37°C for 3 hr as described (Emiliani, S., Fischle, W., Van Lint, C., 
Al-Abed, Y., and Verdin, E. (1998) Proc. Natl. Acad. Sci. USA 95, 2795-2800). 

Results of the in vitro HDAC enzyme assays show that cells expressing the HDAC10-Flag 
fusion protein contain 2.5-3 fold higher enzyme activity than cells expressing the pcDNA3.1(+) vector 
alone. Therefore, HDAC10 is likely to contain intrinsic histone deacetylase enzyme activity. 
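
A fold difference of this kind may be computed from released-radioactivity counts as sketched below. Subtracting the lysis-buffer-only control as background is an assumption of this sketch rather than a step stated in the protocol, and the counts shown are hypothetical values, not experimental results.

```python
def fold_activity(sample_cpm: float, vector_cpm: float, blank_cpm: float) -> float:
    """Background-corrected activity of the sample relative to the vector-only control."""
    net_sample = sample_cpm - blank_cpm
    net_vector = vector_cpm - blank_cpm
    if net_vector <= 0:
        raise ValueError("vector-only control must exceed the blank")
    return net_sample / net_vector


# Hypothetical counts (cpm): HDAC10-Flag cells, vector-only cells, buffer-only blank.
print(f"fold activity = {fold_activity(3200.0, 1400.0, 200.0):.1f}")
```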

Example 4: Identification of HDAC10-associated proteins 

Using conventional methods, proteins in the same complex as HDAC10 may be identified by 
their ability to coimmunoprecipitate with HDAC10-Flag fusion protein. The HDAC10-Flag 
expression vector or the vector alone is transfected into 293 cells and cell lysates are prepared as 
described above. The lysates are precleared with Sepharose A/G plus agarose beads, followed by 
immunoprecipitation using anti-Flag antibody at 4°C overnight on a rotator as described in Example 3. 
The immune complexes are washed twice with ice-cold lysis buffer (see Example 3), twice with lysis 
buffer containing 1 M NaCl and twice with PBS. The final complexes are separated by SDS-PAGE on 
10% Tris-glycine gels, transferred onto a PVDF membrane and probed with antibodies against known 
HDAC-associated proteins or other HDACs. Conversely, the immunoprecipitation could be done 
using antibodies of choice, and the resulting immune complexes could be probed with anti-Flag 
antibody. 

What is claimed is: 

1. An isolated polypeptide comprising the amino acid sequence set forth in SEQ ID NO:1. 

2. An isolated DNA comprising a nucleic acid sequence that encodes the polypeptide of claim 1. 

3. A vector molecule comprising at least a fragment of the isolated DNA according to claim 2. 

4. The vector molecule according to claim 3 comprising transcriptional control sequences. 

5. A host cell comprising the vector molecule according to claim 4. 

6. The isolated DNA according to claim 2, comprising a nucleotide sequence selected from the 
group consisting of (1) the nucleotide sequence set forth in SEQ ID NO:2; (2) the nucleotide sequence 
set forth in SEQ ID NO:3; (3) a nucleotide sequence capable of hybridizing under high stringency 
conditions to a nucleotide sequence set forth in SEQ ID NO:3; and (4) the nucleotide sequence set 
forth in SEQ ID NO:4. 

7. A vector molecule comprising the isolated DNA molecule according to claim 6, or a fragment 
thereof. 

8. The vector molecule according to claim 7 comprising transcriptional control sequences. 

9. A host cell comprising the vector molecule according to claim 8. 

10. A host cell which can be propagated in vitro and which is capable upon growth in culture of 
expressing HDAC10, wherein said cell comprises at least one transcriptional control sequence that is 
not a transcriptional control sequence of the natural endogenous human gene encoding HDAC10, 
wherein said one or more transcriptional control sequences control transcription of a DNA encoding 
HDAC10. 

11. A method for the diagnosis of a condition associated with abnormal regulation of gene 
expression, which includes abnormal cell proliferation, cancer, atherosclerosis, inflammatory bowel 
disease, host inflammatory or immune response, or psoriasis in a human which comprises: detecting 
abnormal transcription of messenger RNA transcribed from the natural endogenous human gene 
encoding HDAC10 in an appropriate tissue or cell from a human, wherein said abnormal 
transcription is diagnostic of said condition. 

12. The method of claim 11, wherein said natural endogenous human gene comprises the 
nucleotide sequence set forth in SEQ ID NO:4. 

13. A method for the diagnosis of a condition associated with abnormal HDAC10 expression or 
activity in a human which comprises: 

measuring the amount of HDAC10, or fragments thereof, in an appropriate tissue or cell from a 
human suffering from said condition wherein the presence of an abnormal amount of said polypeptide 
or fragments thereof, relative to the amount of said polypeptide or fragments thereof in the respective 
tissue from a human not suffering from said condition associated with abnormal HDAC10 expression 
or activity is diagnostic of said human's suffering from said condition. 

14. The method of claim 13, wherein said detecting step comprises contacting said appropriate 
tissue or cell with an antibody which specifically binds to a polypeptide that comprises the amino acid 
sequence set forth in SEQ ID NO:1 or a fragment thereof and detecting specific binding of said 
antibody with a polypeptide in said appropriate tissue or cell, wherein detection of specific binding to 
a polypeptide indicates the presence of a polypeptide that comprises the amino acid sequence set forth 
in SEQ ID NO:1 or a fragment thereof. 

15. An antibody or a fragment thereof which specifically binds to a polypeptide that comprises the 
amino acid sequence set forth in SEQ ID NO:1 or to a fragment of said polypeptide. 

16. An antibody fragment according to claim 15 which is an Fab or F(ab')2 fragment. 

17. An antibody according to claim 15 which is a polyclonal antibody. 

18. An antibody according to claim 15 which is a monoclonal antibody. 

19. A method for producing an HDAC10 polypeptide, which method comprises: 

culturing a host cell having incorporated therein an expression vector comprising an 
exogenously-derived polynucleotide encoding a polypeptide comprising an amino acid sequence as 
set forth in SEQ ID NO:1 or a nucleotide sequence capable of hybridizing under high stringency 
conditions to a complement of said polynucleotide, under conditions sufficient for expression of the 
polypeptide in the host cell, thereby causing the production of the expressed polypeptide. 

20. The method according to claim 19, wherein said exogenously-derived polynucleotide 
hybridizes under stringent conditions to the nucleotide sequence as set forth in SEQ ID NO:2. 

21. The method according to claim 19, wherein said exogenously-derived polynucleotide comprises 
the nucleotide sequence as set forth in SEQ ID NO:3. 

22. A histone deacetylase which comprises the catalytic domain of HDAC10. 

SEQ ID NO:1 

MLHTTQLYQH VPETPWPIVY SPRYNITFMG LEKLHPFDAG KWGKVINFLK EEKLLSDSML 60
VEAREASEED LLVVHTRRYL NELKWSFAVA TITEIPPVIF LPNFLVQRKV LRPLRTQTGG 120
TIMAGKLAVE RGWAINVGGG FHHCSSDRGG GFCAYADITL AIKFLFERVE GISRATIIDL 180
DAHQGNGHER DFMDDKRVYI MDVYNRHIYP GDRFAKQAIR RKVELEWGTE DDEYLDKVER 240
NIKKSLQEHL PDVVVYNAGT DILEGDRLGG LSISPAGIVK RDELVFRMVR GRRVPILMVT 300
SGGYQKRTAR IIADSILNLF GLGLIGPESP SVSAQNSDTP LLPPAVP



SEQ ID NO:2 

   1 agctttggga gggccggccc cgggatgcta cacacaaccc agctgtacca gcatgtgcca
  61 gagacaccct ggccaatcgt gtactcgccg cgctacaaca tcaccttcat gggcctggag
 121 aagctgcatc cctttgatgc cggaaaatgg ggcaaagtga tcaatttcct aaaagaagag
 181 aagcttctgt ctgacagcat gctggtggag gcgcgggagg cctcggagga ggacctgctg
 241 gtggtgcaca cgaggcgcta tcttaatgag ctcaagtggt cctttgctgt tgctaccatc
 301 acagaaatcc cccccgttat cttcctcccc aacttccttg tgcagaggaa ggtgctgagg
 361 ccccttcgga cccagacagg aggaaccata atggcgggga agctggctgt ggagcgaggc
 421 tgggccatca acgtgggggg tggcttccac cactgctcca gcgaccgtgg cgggggcttc
 481 tgtgcctatg cggacatcac gctcgccatc aagtttctgt ttgagcgtgt ggagggcatc
 541 tccagggcta ccatcattga tcttgatgcc catcagggca atgggcatga gcgagacttc
 601 atggacgaca agcgtgtgta catcatggat gtctacaacc gccacatcta cccaggggac
 661 cgctttgcca agcaggccat caggcggaag gtggagctgg agtggggcac agaggatgat
 721 gagtacctgg ataaggtgga gaggaacatc aagaaatccc tccaggagca cctgcccgac
 781 gtggtggtat acaatgcagg caccgacatc ctcgaggggg accgccttgg ggggctgtcc
 841 atcagcccag cgggcatcgt gaagcgggat gagctggtgt tccggatggt ccgtggccgc
 901 cgggtgccca tccttatggt gacctcaggc gggtaccaga agcgcacagc ccgcatcatt
 961 gctgactcca tacttaatct gtttggcctg gggctcattg ggcctgagtc acccagcgtc
1021 tccgcacaga actcagacac accgctgctt ccccctgcag tgccctgacc cttgctgccc
1081 tgcctgtcac gtggccctgc ctatccgccc cttagtgctt tttgttttct aacctcatgg
1141 ggtggtggag gcagccttca gtgagcatgg aggggcaggg ccatccctgg ctggggcctg
1201 gagctggccc ttcctctact tttccctgct ggaagccaga agggcttgag gcctctatgg
1261 gtgggggcag aaggcagagc ctgtgtccca gggggaccca cacgaagtca ccagcccata
1321 ggtccaggga ggcaggcagt taactgagaa ttggagagga caggctaggt cccaggcaca
1381 gcgagggccc tgggcttggg gtgttctggt tttgagaacg gcagacccag gtcggagtga
1441 ggaagcttcc acctccatcc tgactaggcc tgcatcctaa ctgggcctcc ctccctcccc
1501 ttggtcatgg gatttgctgc cctctttgcc ccagagctga agagctatag gcactggtgt
1561 ggatggccca ggaggtgctg gagctaggtc tccaggtggg cctggttccc aggcagcagg
1621 tgggaaccct gggcctggat gtgaggggcg gtcaggaagg ggtacaggtg ggttccctca
1681 tctggagttc cccctcaata aagcaaggtc tggacctgca aaaaaaaaaa aaaaaaaaaa
1741 aaaaaaaaaa aaaaa



SEQ ID NO:3 

  25                           atgcta cacacaaccc agctgtacca gcatgtgcca
  61 gagacaccct ggccaatcgt gtactcgccg cgctacaaca tcaccttcat gggcctggag
 121 aagctgcatc cctttgatgc cggaaaatgg ggcaaagtga tcaatttcct aaaagaagag
 181 aagcttctgt ctgacagcat gctggtggag gcgcgggagg cctcggagga ggacctgctg
 241 gtggtgcaca cgaggcgcta tcttaatgag ctcaagtggt cctttgctgt tgctaccatc
 301 acagaaatcc cccccgttat cttcctcccc aacttccttg tgcagaggaa ggtgctgagg
 361 ccccttcgga cccagacagg aggaaccata atggcgggga agctggctgt ggagcgaggc
 421 tgggccatca acgtgggggg tggcttccac cactgctcca gcgaccgtgg cgggggcttc
 481 tgtgcctatg cggacatcac gctcgccatc aagtttctgt ttgagcgtgt ggagggcatc
 541 tccagggcta ccatcattga tcttgatgcc catcagggca atgggcatga gcgagacttc
 601 atggacgaca agcgtgtgta catcatggat gtctacaacc gccacatcta cccaggggac
 661 cgctttgcca agcaggccat caggcggaag gtggagctgg agtggggcac agaggatgat
 721 gagtacctgg ataaggtgga gaggaacatc aagaaatccc tccaggagca cctgcccgac
 781 gtggtggtat acaatgcagg caccgacatc ctcgaggggg accgccttgg ggggctgtcc
 841 atcagcccag cgggcatcgt gaagcgggat gagctggtgt tccggatggt ccgtggccgc
 901 cgggtgccca tccttatggt gacctcaggc gggtaccaga agcgcacagc ccgcatcatt
 961 gctgactcca tacttaatct gtttggcctg gggctcattg ggcctgagtc acccagcgtc
1021 tccgcacaga actcagacac accgctgctt ccccctgcag tgccc tga



SEQ ID NO:4 

ctacacacaa cccagctgta ccagcatgtg ccagagacac gctggccaat cgtgtactcg 17663325
ccgcgctaca acatcacctt catgggcctg gagaagctgc atccctttga tgccggaaaa 17663265
tggggcaaag tgatcaattt cctaaaaggt atggaaggtc ccccttggac tctcatctgc 17663205
ttcctccaac ccacctgtcc tctccgtcct catccccaac ataagcctca ggctctctcc 17663145
catcttcagt ttcagccctc ggatggcctt ccacccatgc ttccgcccaa aatgattttt 17663085
ccaacacaga ctcctaatca cgatatgatg tccctgactc agactctccc tggctcccca 17663025
tcctgtgggc ctaagtcctg cctctgccca agaggcctag tggaaaggta gctgattact 17662965
gatgggcaca gggaaggtga agcttggagg agtccatttc ctaaggttca gagagtcagg 17662905
aggtagagca cctccaccgc acctctcttg attacagatg ggggaaattg tgtcctagaa 17662845
tgattaggaa acatgtgcac ccaattccag tccagtcctc acagcagccc tcggggtagg 17662785
caccacaatc gcagcagagg ctcaggagct cactgtaacc tccgcctttc aggttcaaac 17662725
aatttttctg cctcagcctc ccaagtagct ggaattacag gcgtgagcca ccacacccgg 17662665
ccctgatttc ttaatatggc actcattata agattgtaaa agcccacctg tagaccgaac 17662605
tgggcacact ggctgcctgc ttgtgacctc tttccaggga aggacacagc tcccattagt 17662545
ggctgaagta acacagttac aagaggcgga gttgggtttg gaactcagag ctccaggcgc 17662485
cctaccttta gggctcatcc ccttgagcaa aatgatgctt cgaagagcat atcgttttaa 17662425
ctgtggtttg taatcaaggg gcctgattta ggtgggaaat tcacttaaac ttgttttaaa 17662365
aggaaacatt atgtcatcaa aatgggaaaa ggcagtttca cttgccataa ataggtcatg 17662305
gtaaaaaagt aaatgcaatg aaaacaacag tataattcaa tccaggctgg ttactattgc 17662245
ctgcaggctg tgagactgat tagtggtttg aacggaagat gagcaaagca caggcaggtg 17662185
ttgcgaggcc atgccacact gagcctcctg taatatcatc agaaggtgga gggaggccgg 17662125
gcgcagtggc tcgtgcctgt aatcccagca ctctgggagg ccaaggctag gagaacactt 17662065
gaggccggga gtttgagacc agcttgggca acatagcaag atcctgtctc tacaaaataa 17662005
aaataaaaaa cttagctggg ggtggtggta tgcacctata gtcctagcta cttgaaatgc 17661945
tgaggcagga ggatcacttg agcccagaag ttcgagggtg cagtgagcta tggttgtgcc 17661885
actgcactcc agcctgggct acagagcaag accttgtctt gcatttattt atttgtttat 17661825
ttatttattg agacagggtc tcactcccat cacccaggct agagtgcagt ggcggaatca 17661765
aggctcattg ccacctcaac ctccctggct taagtgatcc tcccacctca tgtttttgat 17661705
attttgtaga gatgcggtct cactatgttg tctaggctgg tcttgaactc ctgggctcaa 17661645
gcaatccacc tgcctcaacc tcccaaagtg ctgggattac aggcgtgaac caccacacct 17661585
ggccaagacc ctgtctcttt aaatgaatta aaaaaaaaaa aagggcgggg ggaaggtgga 17661525
gggggaattc ctaagaagag tttttctcac tctgagggtc aacatccctg acccttgtgc 17661465
cacctgctcc tgaaggttgt ctagcacacc tgagctctcc ttgtgactat cagtggcttg 17661405
ggaaacatgg ggattgctgt gtgtacgatg ttcattgctc cctggccaga gggactggcc 17661345
actgtccaca gtggctgggg aggctacccc ttctcagaag gcccacaagc cagcagtgcc 17661285
tacctacccc tggggcaggg gctgccacag gccaagtctg cagcctgtgg gagggtctgg 17661225
ggctggccct ggccttgagg tcagtgggga agcaggatgc tccctctgtg gtttcagaag 17661165
agaagcttct gtctgacagc atgctggtgg aggcgcggga ggcctcggag gaggacctgc 17661105
tggtggtgca cacgaggcgc tatcttaatg agctcaaggt acaggatgtc gggcctgggg 17661045
ggctgcgggc ctggggcagg gggctgctgg ccaggagtgg ccagaggcag gaggtgactc 17660985
agcctgggga agccaagtct cacagggcac ccattcatgt ccctagtgtt ggaggaacat 17660925
gggagtctgt ggtccccaag agaaggagag aggtcataaa aaggcagacc tcagtttggg 17660865
ccaggccact ctgagggtgg tgtcctcccc ttctccaggg cgtatgaaag ccttcataga 17660805
attttaggct tctacattat gactttcaag ctgtgctctg tcgacacgcc tccgagaccc 17660745
cagcccctgt cctccaacca tacatagctc tttcactttg gtctattttg tttgtttgtt 17660685
tgttcatttt tttgttggtt tttgttcttg aaatggagtc tcactctgtc gcccaggctg 17660625
gagtgcagtg atgtgatctc ggctcattgc aacctccgcc ttccgggttc aagcaattat 17660565
cctgtctcac cctcctgagt gagtagctag gattacaggc gcgtgccacc atgcctggct 17660505
aatttttttt gtattttagt agagatgtgg tttcgccgtg ttggccaagc tggtctcgaa 17660445
ctcctgaact caggtcatct gcccacctcg gcctcccaaa gtgctggggt tacaggcgtg 17660385
agccaccgca cccaccctat tttttatatt gggctgaagt ttaagactct ggtctaagta
cttctgctga agttttgttg aaaattgttg gtctaaaaac taatttgaaa ccctcagggc
tcagcagaga agagaaacaa gtgggagggc cggtggtaga gtctgaggtg aactcctgcc
ccttcccaag gggcggctcc tcagctccac tgtgggcccg gcatggccag agcacctggt
cttcaaagag aagccaggaa tccagattat taagtgacat ttcctgattt tttttttgag
actgagtctc gctcttgttg agcaggctga agtgcagtgg cacgatctca gctcactgta
acctccgcct cccgggttca aacaattttt ctgcctcagc ctccgaagta gctggaatta
taggggtgag ccaccacacc cggccctgat ttcttaatgt ggcactcatt ataagattgt
aaaagcccac ctgtagacca aactgggcac actggctgcc tgcttgtgac ctctttccag
agaaggacac agctcctatt agtggctgaa gttctgaggg ctgaggcatt cagttcagtg
ctctttgtag gaacagaggg gaggttgggg cgggggcttg cattggaatc tggtactgcc
agcctgcctt ggtgggggtg gggtcaggga tgcctcaggt tatctgcccc aagagtgtgg
gagccctgac ccccaggctc cctggctgag ctcaccttag actcagagcc acagtggatg
cctgaggcca gcaggcccct ctgctccaca ggtggaaaag cctaggtcca gaaagaggct
gtgctcaagg tcacctggga agttggccgg gccttgggga gacccctggc aggtcatcca
gtccagtctt ctaggttccc agtcagggct gctgcctccc tgctccccaa ccgcagcctg
aggtgtgaga attctagata gggccacgac agtgtgagca catgaaagat taccaggaag
aggttgaaac ctggctcctg ggagagagag gggtgtgagg ccttggcagg aagcccagtg
cttggctgcc ctggtttcct ggggcccagg catgcgtggt cacagtccac agcctagggc
tgggccagga ggacatgcct gccagagtcc cgagggtgag gggaaggaag ggacaggagg
cgctcagctg gggcagggag aaaccaaaac agaatggtgt gattgaacca ggctgggggt
ggggggccta ggttccaggg ccccacccat ttgaggggcc ttcaggggaa ctgtgttggg
caggctgcat gcctggcctt ggtcccccaa aagcctgaaa gcagcttact atgtgatata
taataataca aaatagctgg gtgtagtggc atgcacttgt agtcctagct acttgggagc
ctgaggcagg agaccttgag cccaggagtt tgaagctgta gtgagctatg attgcaccac
tgcactccag ctggtatgac agagtgagac tgtctcttaa aaaaaataat aaaagtatta
acaggtagag tcccaagtag aaaactgagg ttgagggtag gaggagaatt caggtatgtc
cactgaaaaa gttaaccaag atggtgatcc agctgcatat ttggcttgga gctccctggc
agtcagaaca aaaggagaaa catgatggtt tctacggcac ctattaagat gaagaagtag
gccgggtgca gtgactcatg cctgtaatcc cagcactttg ggagaacgag gcgggcggat
cacttgaggt cgggagtttg agatcagcct ggccaacatg gagaaaccct gtctctacta
aaactacaaa attagccagg catggtagtg catgcctgta atcccagcta cctgggaggc
tgaggcagga aaatcacttg aacctgggag gtagaggttg cagtgagccg agattgcgcc
attgcactcc agcttgggca ataagagtga aactccatct caaaaaaaaa aaaaaaaaaa
aaagaaaaag atgaagaagt agtcagtcat tcaacacatc tgtattgaat gccaactgta
cagagagaat aagacagcag ggctctctgc caccatggat ttgcatttga gtcgtggaag
attaaaatta aggaagcaac cacccaagag cattttagag agcaccaagg gctatgaaga
aagtgaaaaa tagagggtaa ttggatggtc agggagggcc tcacagagga ggtgatgttt
gagttgagac taaacaaagg agcaggtgat actcatgtag aggtgttttt tttttttttt
tttttttttg gagaaggaat ctcgctttgt tgcccaggct cgagtacagt ggtgcgatct
cagctcacag caacctctgc ctcttggttc aagcgattct cctgcctcag cctcccaagt
agctaggatt acaggcacct gccaccatgc ccggctaatt tttgtatttt tagtagagac
ggagttttca ccatgttggc caggctggtc tcgaactcct gacctcaagc aatccatctg
cctcggcctc ccaaagtgct gggattacag gcatgagcca ctgctcctgg ccctcatgta
tagctttgaa ggaagaatgt ttcagaatcc caggcctgga gggtggaggg gacttgatct
tccaaagggg agaagaatgc ttgggaggcc ggatggaagg gaataaaaca ttgtggctcg
tacacggtgc agttagggag gccagagccc caggccacac aaggtcttgc aggccgtggg
aggagtgtat atgttgttcc agggaccttg gacagtcacg agggggtttt cagcaggagg
gtgatatggt gtgacatgcc cttgctgccc aggtgggacc caagcccgtt tcagacatca
tctggcacct aaggctgcag ctcaggaaca tctcccacct ccctgcagat gtctgcaatg
tttcttttct ccttcctctg ctgtgggcgc ccagagagtg ccctagagag tccttcaggt
ttctcaggct gcttttccct ggtcattctg tgtgtgctgt gtaacatcca ccgtctcccc
tgcctcatcc cattctaccc ccaacccctg cctggggctc atgcctgact ctgcactggt
gtggcctttg atacttaata aacagggcac tgaaggagaa gcaggagctg gacgtttgca
agatgtcaat tcagggaaac ccatgtttat caagctcctg ctgtgtgcaa ggtccagggt
tggcccctct gagggtagct gttgagctcc ccagtgcccc agcactgggc tcttgccttt
ggttgattct cgggtgacag tttgccatgg agtgtcggtt agtgctgggc agcatctgac
atcctgcccc tgtgcactct gcactggaca gtgctcagaa cacgtggatc cagcaagtgc
tcagagggca ccactctgtg atctaggtgc tgcagggatg ggatggagca aaagaccaca
tccctttcct gctggagctg gcatttaggt gggagagtca gacaataaat gtaataatta
agtaatgaga taatatgtta gatggtgctg agtgtcgtga agaaaggaag ggacagcaga
aaaggggtgg ggagagctgg tgagaggatg gcagttttaa atcaggagtc aggaaagggc
ttactacctg tgatcacagg tgacatgtgg gaagggagtg agggagtggg tgatgtggtc
atctggggaa gggcattcca agcagaagaa acagcaagtg caaagatccc agggcagaac
tatctgtcat gagttccagt atagtgtgga gagaaggaga cacagaccat agctccatgg
agcacctgga gggaccctgg agagtctcta ggggagtgag ctcctcttgg tctccaactc
tctcttctct tccctgaggg gctcctctct cctttaaaaa aaaatttttt ttaattgtgg
taaaatttac ataacaaaat tcgccattaa ccactttaaa ctgtacagtt cagtggcctt
tagtccattc acaaagtgct gcaaccatca tctctagttc caaacatttt catcactcca
aaaggaaacc ctgtgtcctt taaacacttg ctccccattt atccccccaa gtccccttgg
taatcactca cctgcattct ctctctatgg atttgcctat cctggatatt tcatataaat
ggaatcatac aatatgtgac cttttgtgtc tggcttatct cactaagcac agcgttttca
acattcatct gtgttgtgtt gtagcatgta tcagtacttc attccttttc acagcagaat
gatattccat tgtaaaacac tacatttttt ttatccattc attagtttat aggccttttg
gctattgtga gtagtgttgc tgtggacatg tgcatacgag tatttattag aatacctgtt
ttcagttatt tggggtatac acctaggagt agaattactg ggtcacatgg taattctgtt
taattttctg aagaaccatc aaggtgatct ccacgggggc tgcaccattt ccaccagtaa
tgtaccaggg tcccaatttc tctacatcct tttcaatgct tgttattttc tggtgttttt
ttttttcccc cccagtgtgg ccatcttact ggatgtgaag tggtatctca tggttttaat
ttgcatttac ctaatggcta attaacactg aggatctttt catgtgctga ttggctattt
gtatatgtca tttggagaaa tgtttattca agtcctttgt ccatttttaa aattggcttg
tctttttgtt gagttgtagg gttctttata tattctggat attatttaat ttgtaaataa
ctcctcccat tctgtgggtt gtcttttttt tgatagtgtc ctttgatgca caaaaatttt
agttttgctg aagtccaatt tatctttttt tccttttctt taggtgtcat atctaagaat
ccattgccaa acccaaggtc atgaaggttt accgcatgtg ttttcttcta agagttttat
agttttcact tatatttagg ccttgataaa ttttgagtta atttttgtat atgtgtgagg
caagtccaac ttcattgttt tgtactcaga tatccagtta tcccagcacc atttgttagg
ctgtttttcc cctgttgaat ggtcttggta cctttgtaga aaatcaactg gccatagatg
tatggattta tttctagact ctcaattcta ttcatttttt tggtttgttt gtttaagaaa
gggttgcatt ctttcgacag cccaggctgg agtacggtgg ctccatcttg gctcactgca
acctccgtct cctgggttca agcaattctc ccatctcagc ctcccaggta gctgggacta
caggcgtgtg ctaccatgcc tggctaattt ttgtgtttct tggtagagat ggggtttcac
catgttggct aggctggtcc tgaattcgtg acctcaagtg atttgctcac ctcggcctct
caaagtactg ggattacagg catgtgtgag ccactgcgcc cagccaattc tattcatttg
atctatatgt caataccaca ctattttggt actgttactg tggcttactg tggttattgt
ggctttggag caaattttga aattccagat tgtgaggcct ccaactttgt tctttttttt
tttttgagac gcagtctcgc tttgtcgcct atgctggagt gcaatggcgc gatctcggct
cactgcaacc tccgccttct ggtttcaggt gattctcctg cctcagcctc ccgagtagct
gggattacag gcgcccggca ccacgcctag ctaatttttc tatttttagt agagatgagg
tctcaccatg ttggtcaggt tggtctcaaa ctcctgacct catgatctgc ctgcctctgc
ctcccaaagt gctgggatta caggcatgag ccaccgtgcc cagccaactt tgttcttttt
taagatcgtt ttggctgttt gaggtccctt gagattccat gtgaattata gcatcaactt
ccattttttg caaaaaaggc cattgggatt ttgacaggaa ttgcattgag taaattgctt
tggggagttt tgccatctta acaatattcg gtctttcaat ccatgaacat gggatgtctt
tccgtttatt tatgtcttta atttctttca gcaatgtttt gtagctttca atggacaaat
cttgcacctc ttggttaaat ctattcccat gcattttatt cttttcgatg ttattataaa
tgaaattgtt tgaatttcct tttaagattg ttcattgctg gtatatacaa taatcagttg
tatagaaata caactgattt ttttgtgttg atcttgtatc ctacaacttt gctgaatttg
tttcttagca tttttttctt tttttttttt tttttttttt ttttagacag agtctctctc
tgttaccagg ctggagtgca gtggcatgat ctcggctcac tgcaacctcc gcctcccagg
ttcaagcgat ttttctgcct cagcctccca agtagctggg actgcaggtg catgccacca
tgcccagcta atttttgtat ttttagtaga gatggggttt cgccatgttg gccagtgtgg
tctcgatctc ttgacctcgt gatctgccca cctcggcctc tcaaagtgct ggtattacag
gcatgagcca ctgcgcctgg cctgtttctt agctttaata gttgtgtgtg tgtgtgtgtg
tgtgtgtgtg tgtgtgtgtg tgtgtgtatt ctttaggatc ctctatatat aacatcatac
cgtctgtgaa gagaggtagc ttcctttcca atttggatgg cttttattta tttttcttgc
ctaattcctc tgattggaac ttccagtact atgttaaata gcagtagtgg agcaggcatc
tttgtcttgt tcctgatctt agacagaggg ctttcaatat tttaccattg agtataatgt
cagctgtggg gttaaatttt ttaacgcctt ttatcatgtt gagggagttc ccttctgttc
ctagtttgtt gagtgatttt atcacaaaag gctattgaat tttgtcaaag gctttttgtg
catcaactga gagatcgtgt tttccccttc tctgcttttg ctccccttct actggtagaa
aggacccacc taaagcaagc agtgggcgcc ctagaggggt tacagcctag ctcttccctg
agagcagttc ttggtttgaa cctgagggca gcgggtccgc ctgaggaaac caggtgtctg
gaaggtgaag gcttgtggag ctgagtagat ggggcagtag gtcccagaga tatggccagc
cccagtcatg 
atagctggct 
ctcccatgct 
atggtggctc 
gctgggcccc 
tttctaatac 
cagagtctga 
gagcgggcag 
ttccatggag 
ggcctggctg 
gcttctgtca 
ggccagcacc 
cgcgtgctgt 
tctgtagcct 
ggggccccag 
caggctggag 
agattctcct 
ctaatttttg 
ctctggacct 
agccaccaag 
gactggtttc 
cagagaagtg 
gggaacttct 
cgagactatc 
gtcccagaga 
ctcgggggct 
tgccctgagc 
gccctcttcc 
gtttatttct 
aaccttctgc 
gcatgtggct 
tttgatgaat 
tggtgctggt 
ccaggagaag 
ggggctcgtc 
tcctgctggc 
caggtccaat 
ccactgcagt 
tcttctgaca 
gcctttagca 
tcccaaacga 
ggagcacacg 
tgtcttccag 
gtgctccttc 
cagccacacg 
tgacctcagg 
cacgcacacg 
acactcatgg 
ccccttccca 
tggtgccccg 
ctaccagagt 
gctggtggcc 
actatgacag 
tgacttaggg 
ggccctctga 
cctgctgcct 
tcctggttgg 
ttggtgcctc 
ctgcacctta 
gggttctctc 
tcacctccca 
cacctggctg 



tcctgctctc 
acatgcaggc 
cacatagtgt 
ttaaacccca 
tcccagagtt 
attctcaagt 
gatggtgcag 
agtgagcagc 
atctctgggg 
ttacaggccc 
ggagaagggc 
tactgtgtgc 
cctcctggag 
cagggagaga 
gtgggagtat 
tgcagtggtg 
ccctcgcctc 
tgtttttaat 
caagtaatcc 
cctggctggg 
cacctctaag 
aattctaaat 
gagcctgtcc 
agggagcctg 
aggtatctgt 
atccctggaa 
tcctcctacc 
ctctcctccc 
caccttggcc 
aatgatggaa 
cttgaaatat 
ttacaatcac 
ctagggtgtt 
gcccaagtgc 
tcgccaacgt 
caagtcccgt 
ctgtggagga 
tttgcagaag 
tctgggggga 
tgctaggatg 
cttgccggtg 
agagaatgcc 
tctgtgtccc 
agcggctcgt 
cggggcctct 
cagggacctt 
gacatacctg 
gtgtgtcctg 
ggccctgtga 
tgggcacctc 
gagggagtga 
atgggcattc 
aagcctcccc 
tgtgtcctgc 
ggttcgtcac 
gctgtctgtc 
tgaacgcaat 
cctccacctg 
ggtctttcca 
tctgggcctt 
ctctcccacc 
ccccaggggc 



tgtggagtcc 
catgcccttt 
gctcattcac 
gcaagtatct 
tctgatccat 
gttgtggatg 
gcgatttcag 
tgagcacaga 
cgtgaatgtc 
ttgtcagtca 
tctggtgcac 
caggcatggc 
ctggcatcct 
agtgctatct 
tttattttat 
cgatcttggc 
ctgagtagct 
ggacaccaga 
gcctacctca 
tgtggggatt 
tcctcatcca 
tcacatagcc 
accccagtcc 
acctgctgga 
cagcagtgca 
gtgttggtca 
tgccacctcc 
ctcacccagg 
actgatgggt 
atgctcagac 
ggagagtgta 
tcgtaagtag 
ggcaaccaca 
cagcctcctc 
tggcacagca 
gcatgctcct 
taccaaggaa 
gttagtgtgt 
gcaaagttag 
tgctgcaaat 
gaagcctcct 
tttctcgtgg 
ctgctggctt 
ttgtttgctc 
gccgggcagt 
cctttctctg 
tgcacacatg 
cagctgtctg 
tgcctccatg 
tccttcccga 
tgccagcttc 
cccagcagtg 
tggtggccag 
cttttgtccg 
ccctctgcca 
attgaacatg 
ggccacactt 
ctccttccag 
catctcaccc 
gcccttcagc 
cctgttctga 
tgacttggcc 



cacagaggct 
ggcgggtggt 
ccagcactgc 
gaaacactgg 
gttgtcttgg 
ctgctggtct 
atgaaccctg 
tgtggatttg 
accacagggt 
tggctctcct 
cagccagaaa 
ctcagcactg 
tttgagggag 
gggaagatga 
ttttttgaga 
tcactgcaac 
gggattacag 
tttcaccatg 
gcctcccaaa 
ttagattaga 
aagccttgtt 
agtggcagaa 
tagcctcacc 
tctgggcagt 
gcacccccca 
gaaagtgaat 
tctgaccaca 
gacccgccac 
ggtttctcct 
ctgctctgtg 
actgaggaac 
ccacctgtgg 
tcactgcctt 
ttcactgccc 
aacacacata 
gggtggctgc 
cctctttgag 
gtgacttaaa 
aatggaatat 
ctccaggagg 
tgaggagtgc 
tttgtgtcca 
cccagggagg 
attcgttcat 
gggatgagtg 
ggtctgtccc 
tatacacaag 
gctgtgctgg 
ttaccgccag 
ccatgagtgg 
ccccgccttc 
tgggcaggct 
ggcctaagcc 
gccctgagtg 
tcacacccat 
ctcgtgtttc 
cccactttcc 
ccaccctctc 
tgtcccaggg 
atgggaagcc 
gctccagtct 
catagagagc 



gacgaggtat 
ggcgtcagtc 
cttaggttgg 
agggcttgtt 
gtagagactg 
gagaaccaca 
caagaggcac 
gaagtgtggc 
tgccctgccc 
gggatgatgc 
aggggatcaa 
tctgcacagc 
atagatgcta 
agccaaggtg 
cagagtttca 
ctccacccct 
gcacctgcca 
ttggccaggc 
gttctgggat 
tgaggaggac 
ttatagatga 
cccagacttg 
cacagtgccc 
cccaccgtgg 
cctgccccac 
ctccagatgt 
tagagcctgc 
tagtccgccc 
agagcggtgc 
cagtccagtc 
caaacttgaa 
ctggcagcca 
gtgcagaaac 
gaagcctgct 
ctttctcctg 
acctggcccc 
gttcccaagt 
aggcaaagag 
ttgctgcaga 
caggcggcat 
tgtgcgagac 
tgctgggctc 
gagggaggct 
ggaaaaccat 
tggtgaacaa 
gcaacataca 
acacatacac 
tcccagctct 
agggcctggg 
gaccctgctc 
agccgccctt 
gggtgcctgg 
atgaggcccc 
gcctggctac 
ccctggccac 
tcccatccta 
tctcatggaa 
tccacctggc 
aagcccttga 
tgcagtccca 
cacttaaacc 
agaacctagt 



gggggccctg 
tggggcagac 
gctccctaga 
ccagcagatg 
ggaatctgca 
tccctagaag 
aggcagtggg 
ctcagcctga 
agaagcatgt 
aggtgaggtg 
cggcatgcat 
agtgagcaga 
atcgggacag 
tgggctccag 
ctctgtcacc 
tgggttgaag 
ccatgcccgg 
tggtcgtgaa 
tacagatgta 
aggcctctct 
gacagaggca 
gaccagtttg 
ttgcccaggg 
catgctgcat 
cacctggtgg 
gccactggcc 
cactgctgca 
gctccgctga 
tgggggctgg 
tgcaccaggt 
gtgtcccatg 
ggcaggcaga 
acttctcaga 
aagccatgct 
ccgtggctgt 
tcggctgcat 
gtgactccat 
ggttccatgc 
gaggagctga 
cacacgcaca 
acacatacat 
tacactccca 
cttgtggaag 
actgccttct 
gccggcctgg 
cacccccagg 
tgctggggcc 
agcacctctt 
cctctccctg 
aaactcctcc 
tgtctgcagc 
ctcctgagca 
tcgtccccag 
acccagccct 
tcagctgtct 
gccgcctctg 
taccctgctt 
tgggccctct 
cagacccttc 
taattggttc 
ctgccctgac 
gcctgccctg 
agataccttt 
caaatgccat 
accacactca 
ccagtgccca 
tgcctcgcgt 
gttcagctgg 
cactctgtcc 
cagctggaga 
agaggcaatg 
ctcattgtgt 
gtgggtggtc 
gagccctctg 
ggaaccctgg 
gagcttgcca 
tgcttgacca 
gaaccaattt 
tttccgcagt 
cccaacttcc 
ataatggtag 
ttctcacctc 
atattcctca 
ggacatcaga 
ttgctagtat 
aaggcacacc 
ctcgttctgt 
ctcctaggtt 
cgccaccatg 
caggctggtc 
gggattacag 
actgatcctc 
gcaacaaaga 
aaatgtgcca 
tgaggatcta 
gtaaatttgt 
atttcctgga 
tttttttgag 
ctcactgcaa 
ctgggattac 
ggtttcacca 
cggcctccca 
atttctgtga 
gctcatatgt 
aggtgcttat 
ttgcttgggg 
cagagctggc 
gtggatcggc 
tgtggagcga 
agcccggctt 
gtggcagctg 
tggcctggag 
tctgacgtgg 
tacttccggt 
cagcatggtc 
ccaggacagc 
ccaccacctt 
agattggctc 



caggttcacc 
accccttgtc 
ttatctcatg 
tccttgctgc 
gacaatgtgt 
ttccagctgc 
ttcccttgac 
cacccccaga 
tcacactgca 
cctggggcac 
gggtggggag 
gcagccagtg 
tgatcttagt 
tgagggcagt 
cagtgcgagg 
ggccttggga 
tagactaatt 
cacctcctgt 
ccaactagct 
ggagcctggt 

gggagttcat 
ctcacgggtt 
ggtcctttgc 
ttgtgcagag 
gtggggtggg 
ctttgccctg 
aggctcggca 
attctcttct 
ataatatacc 
ccctcaccag 
cgcccaggct 
caagcgattc 
cctggctaat 
ttgaactcct 
gctttgagcc 
catgacttca 
atccccacag 
cacactgcct 
acccagccct 
actttttcct 
agatttcagt 
acagagtctc 
cctctgcctc 
aggcacacgc 
tgttggcgag 
aagtgctggg 
gcaaaacttt 
gatggatgat 
gcactgtaca 
tagggagttc 
tagaagcagt 
tgcctgcgcc 
ggctgggcca 
ggtggaactg 
tgaattcaga 
gttgatgtgt 
gatccttgtc 
ggcagggagc 
tgaagcctgc 
ccacagaggc 
ctcaagggtc 
atgggagggc 



tccaagtgcc 

cctgcatgct 

cttcctctct 

agctagtgca 

gagcctgtgc 

acccaccttc 

ctctgcatct 

aagcctctct 

aataagtgtc 

ctagcaggca 

cagggatgcg 

ccatggatat 

gcagatacct 

gcatcccttt 

gagccagagg 

agatcctcgc 

tgttatccca 

tctgggcaca 

ttaagaaatg 

agggttgtgg 

ccaagggcac 

gcctcagggt 

tgttgctacc 

gaaggtgctg 

ggggcatggc 

gaatgccctc 

acaatgaccc 

catcgttcct 

tctccaccca 

ttttttcttt 

ggagtgcagt 

tcttgcctca 

ttttgtattt 

gacctcaaat 

accatgccca 

gtgatgaata 

caaaattagg 

agttatttgg 

ggatcactac 

ttagcttagt 

atttagtcta 

actctgtcct 

ctgggttcaa 

cactctgcct 

gctggtcttg 

attacaggcg 

gcctattttc 

aagtactttt 

tctcatatgc 

cttctatacc 

gtttatggaa 

ccctcaccct 

tcaacgtggg 

gcctgaaagg 

agctctggtt 

agcctcctag 

taaggaggtc 

ttcctccctt 

cttgtgtctt 

ttggtcatgt 

cagagggccc 

tgcacgggag 



attaccctca 
gcctgctaat 
agggctgcta 
gcttgggaca 
taggagacca 
tctagatcat 
ggataactcc 
aataaccccc 
tgcaagtgtc 
cttagtaaat 
ttttcagcca 
ttacctggtg 
ttcaggtacc 
tgccaggaag 
ccagggctcc 
tgcctaggcc 
aagcagtcct 
agagggcagc 
cattgtgtaa 
ctctggctct 
ctggaaactg 
ggggaagcgg 
atcacagaaa 
aggccccttc 
tgggctgggg 
ctcccactta 
tttctccaaa 
tctcctatga 
ccaaagcgga 
ctttctttct 
ggtgtgatct 
ggctcctgag 
ttagtagaga 
gatccactca 
gccctaatgc 
agcctccacg 
tttcacattg 
agatagagga 
ctactgatcc 
agaatattac 
ctatatttct 
ccaggctgga 
gtgattctcc 
ggctaatttt 
aactcctgac 
tgagccactg 
cctttgaaag 
attttttcca 
cagccaagct 
cctgccttgt 
tgagtgcatg 
ctgcttgtct 
tgagtgctgg 
gggctggggg 
ttcccaagtc 
gtacctggga 
cccgggtggt 
ccagagagcg 
ccctgaagga 
tgggttgggt 
gtgctcccca 
tctcccttgt 



caggccccag 

acctgctcct 

cttctctatt 

gcaccatcta 

ggccctgtgt 

ggactcactt 

tattcactct 

acccagttct 

ctggcatgag 

atttacaaag 

ggagatggct 

cacttggagg 

gtagaccccc 

gtccgattcc 

cgtcccagct 

tcagtgtccc 

agacctgcac 

caagggcctc 

actgctcttt 

catttctacc 

tcctcaaggc 

aggccaacag 

tcccccccgt 

ggacccagac 

gcccccacac 

gtagttgaac 

agcctttttt 

cctcctattt 

tatcctagca 

tttttttttt 

tggctcactg 

tagctgggac 

cggggtttta 

ccttggcctc 

acccaaaatt 

tctcccccac 

tgtgtgtggt 

atgtttcaca 

cctacagttc 

tgcccatccc 

ttttttgctt 

gtgcaggggt 

tgcctcagcc 

tgtattttta 

ctcaagtgat 

cgcctggccc 

ccatatcaaa 

gtttccttgc 

ggcacttact 

agctcagctc 

aatcagtgaa 

ccaaaggcgg 

gaatgtcctc 

agggcgggag 

accctagcct 

gagactgacc 

tccccagccc 

tgtgccatcc 

ctccacctgt 

gggcacatcc 

gcccccttga 

ccctgtcatt 



acccgacacc 
cttaccaccc 
cctgttcccc 
tggttcccta 
gataagctca 
ctctgcccac 
tcacctcctg 
cctcttcatc 
aatgggccct 
tgagtggctc 
tggggtttgg 
tcacagggca 
ccagcctcag 
caatggacaa 
ctgtcagtga 
cttctgtaca 
tgctgacttg 
agaacgctga 
actgagccca 
aaaggaagtg 
atttcccggg 
cccctgtctt 
tatcttcctc 
aggaggaacc 
cccagggtcc 
agaatcctaa 
ccccatcttg 
gttaccgtaa 
ctatggcttt 
tgagtagagt 
caacctctgc 
tacaggtgtt 
ccatgttggc 
ccaaagtact 
aagatggaga 
tgcgggtgtg 
ttttttaaaa 
tgcaaatgta 
tgttatgttt 
caaaactatg 
tttttttttt 
gtgaccttgg 
tcccgagtag 
gtagagacgg 
ccgcctgcct 
agtctactgt 
attattgtca 
acaatttcaa 
tcctggactg 
atccttcccc 
tgaatgactg 
ggaagctggc 
gggaatgtcc 
gatcctggag 
ccttgtggag 
agtgcctcca 
cctctttgcg 
ttgggcagct 
gtcctggggc 
tgggtcaata 
atctcccaca 
gtccctcctg 



17649165 

17649105 

17649045 

17648985 

17648925 

17648865 

17648805 

17648745 

17648685 

17648625 

17648565 

17648505 

17648445 

17648385 

17648325 

17648265 

17648205 

17648145 

17648085 

17648025 

17647965 

17647905 

17647845 

17647785 

17647725 

17647665 

17647605 

17647545 

17647485 

17647425 

17647365 

17647305 

17647245 

17647185 

17647125 

17647065 

17647005 

17646945 

17646885 

17646825 

17646765 

17646705 

17646645 

17646585 

17646525 

17646465 

17646405 

17646345 

17646285 

17646225 

17646165 

17646105 

17646045 

17645985 

17645925 

17645865 

17645805 

17645745 

17645685 

17645625 

17645565 

17645505 






gaggcacagc 
gagacgtagt 
gcaacctccg 
ttacaggcat 
ccatgttggt 
gcctcccaaa 
ttatttgagc 
gatgcgtact 
ctgcagggga 
gctgggtctc 
atgttatggt 
gggttccgag 
taagagcctg 
gtgaccctgg 
gatggtgatg 
agtcaggtgt 
ggtgggagag 
ttgttgggct 
gggggtatct 
cccgggctct 
cctgaccacc 
ctggagtggg 
gcttctggaa 
gggacaccct 
agcttagttt 
gctgggggag 
cagcaccgct 
cgggggcttc 
tggggtctcg 
tcaccctgaa 
ctcagccagg 
agggttgcct 
gggtcaggat 
agggagtagg 
aacccgcgct 
gaggtgaggg 
ccacaactcc 
tgttcctatg 
aggagaaggg 
gagggggcat 
gggctgtggg 
gggcctgagt 
ctcctccagg 
agctgcacag 
attaggcccc 
ctgtccgccc 
tggagggcat 
aggggctgga 
agtctccctc 
aaatgcaggg 
agggtcctcc 
gatgggactg 
ggggcactgc 
ccacacacat 
agtaaagtgt 
ctggtcagac 
ctgtggcttt 
caggtcacat 
ccaccacaga 
gcactgggct 
ccagctctgc 
tggcaaatgg 



acttgacaat 
ttcactcttg 
cctcccaggt 
gcaccaccat 
caggctgggt 
gtgctaagat 
tgggcataat 
ggatcgtgta 
gagtccacct 
ttccatcttg 
actctgctgc 
gagggcctgg 
gatcctgatc 
gcaagttgct 
ctaatgcccc 
tcactgtggc 
gcaagttcct 
caatttcaga 
atgggcccta 
agggccctcc 
cgctttggcc 
gagccacatg 
tgatgagcta 
gaagatgtgg 
gccaacccag 
gtgaagggat 
ctctgccctt 
tgtgcctatg 
cctccaagag 
ttatagacaa 
ctggaactga 
gtggtggaag 
ctgctgtata 
gagtggaaag 
gtgctccagg 
catgcaaata 
cctgccactc 
tgtgcctgct 
ggaagcacag 
ttaggacttg 
gattctctcc 
gacccactta 
gagccttccc 
ctctccctgg 
cctccccgcc 
cagatcctca 
ctccagggct 
ctcttagggg 
ctcatgtccc 
tctgtctttg 
ctccctcctc 
ctctcgtgtg 
cccctgccca 
ggcttggctc 
caaagcagag 
agggagggga 
gtgctagttg 
gtggacagtc 
cttgagaggg 
tgctcaagcc 
cccttaacag 
ggagtttcct 



ttacaaagct 
ttgcccaggc 
tcaaacaatt 
gcccggctaa 
cttgaactcc 
tacagacatg 
tgtcaggcct 
gagccctaaa 
tgggccactg 
gtattaccag 
acaaacccgg 
atggccctcc 
caccctgggt 
gacctcctct 
tcctcgggct 
tgtcttctca 
ggtatccatg 
gggcctctgc 
gggctctgag 
tcgtggctgt 
actggcagaa 
gagccaggcc 
tctgttgctt 
ctgccttctg 
agtccgaggc 
ggaggagctc 
ccgcaggggg 
cggacatcac 
ccctcctgga 
ggggcctatg 
atccagatct 
ggagttattc 
gccacatatt 
aatgatggtg 
acttactccc 
gcaggggcag 
tgtgcctggc 
ctcatggggc 
gagaccttcc 
gccttgtagg 
ttaaccccta 
cctcctgacc 
tgaccacctc 
caagtgacat 
cctcgccagc 
tgtctaccct 
accatcattg 
acctgccacc 
cacggctctc 
tcactctgtc 
ctgactgccc 
cagagctctg 
gagctgctga 
gggcacctgg 
ggttacctcc 
cccagtgggc 
gggccctgcc 
ctttacagtt 
tggcagagcg 
catgctagaa 
tcatgagacc 
gaggggtggg 



ctttttcacc 

tggagtgcaa 

ctcctgcctc 

ttttgtattt 

cgacctcagg 

agccaccacg 

gtctcactga 

gcagggtccc 

cacagttgag 

gtgcctagca 

gagtgatctg 

ccatggcagg 

ttgatcctgg 

gtgggtcagt 

ggagggagtc 

tcattaggag 

gggccagctg 

aattcaggcc 

gctgtgtctc 

aggcagtcat 

tccgtgtggc 

cagcttggtg 

aggggtgtga 

gcctggggat 

acaggttcct 

ctggactgag 

tggcttccac 

gctcgccatc 

atcctcccca 

ctggagcagg 

gacacttgct 

cagcctccca 

aagttgtagg 

ctgatagcac 

tgaatcctcg 

aattggcgct 

cttgtgctgg 

acatagactg 

tggggaggag 

ataaggcaga 

cacatttcct 

tatgaccctt 

ctgcatgcac 

ctttgctggg 

cagctgggag 

ctcctccctg 

atcttgatgc 

cccagttcca 

acggcttctg 

caggacagcg 

ccacatgagg 

ctgtgggtcc 

gcactggcca 

ggtcaccatt 

tcctcaggac 

tccggaaggc 

ttgggttctt 

tgcttttcac 

ggatttcttc 

ggtgtcgggg 

tgctgccccc 

tgggtggcag 



aggctctttt 

tggcgcgatc 

agcctcctga 

ttagtagaga 

tgatacgccc 

cccggccttc 

tgaggaaatg 

ccagcctttg 

gggagcccca 

ttcagtctgg 

tgccctgcgt 

tgttactgcc 

ttctgccatt 

ctcctcatct 

ttcagcaagc 

ccaacagtag 

cacactgtct 

atcccagggg 

agggttgagg 

gaccagcaga 

ccccatacca 

gggacaagga 

gtggcactga 

ggtgacatgc 

gagagctgag 

cctgggagcc 

cactgctcca 

aaggtgtgtc 

tagctccaaa 

gagggggctt 

cctcttccat 

cagagccagg 

aagaagggca 

ctggcagttc 

cagacagaca 

ggcctctggt 

gcatcaggaa 

atggggggaa 

ggaatgaagg 

ggttggggac 

agggaatctg 

cagggcacag 

acatggagcc 

tggcctgatt 

ttgctgtagg 

gcagtttctg 

ccatcaggtg 

gaatcttccc 

tcttctgtct 

ggtcctcctc 

ctcttcctga 

ccattgctta 

cctgcccctc 

taagaactcg 

ccctaatgag 

acccccctgc 

gcgaccccga 

atccctgatc 

ctctgatagg 

cctggtttta 

gagagcaggc 

agccccagcc 



tttctttttc 
tcggctcacc 
gtagctgaga 
cagggtttct 
acctcgctcg 
acccagactc 
gccatggaaa 
gctctgaact 
ctctgcaggg 
catagtaatg 
gtctacagca 
tggtagaggt 
acctggctgt 
gtaaaatggg 
tcagttgctc 
cctcctgggg 
gacggagcag 
ctgcagggga 
ggtgatggat 
gggtgccctt 
ccactccttc 
gcagctttct 
ggacttgctg 
cccagcactc 
cagggaggat 
tggctctgag 
gcgaccgtgg 
tatgagcaag 
ttaactgttc 
gtttgggttg 
gttgcttaga 
ggactagaga 
tggctggcaa 
tgcatgctcc 
ggggcccaca 
ctgtggggcc 
ctgactgacc 
gcaggccatt 
cttcctggaa 
tgaagtccca 
ggaaaatcca 
gacatgcccc 
ccacagctgg 
acccacaagc 
gctgggtcct 
tttgagcgtg 
agtgccctgc 
ggggcaggag 
ctcgggctac 
attgctcccg 
agcccactct 
tgaataattt 
aggcggatgc 
gcgcctaggg 
gccagtgcct 
accattactg 
actcctgagc 
ccaaccagtc 
gaacctaaga 
aggttgaatc 
cgtgctgccc 
ttgcctaggg 
cacctacccg 
catggacgac 
ccgctttgcc 
aggctctctc 
tagcatccct 
gtcactcgac 
aaacaggaaa 
gaaataaatt 
ggctgaagtg 
aaaccccatc 
ccagctactc 
taagccgaga 
aagaaagaaa 
aaaggcagag 
catttttgca 
ggaaccaatt 
gtgtccatgt 
atctccacac 
ctttttgcag 
gctgaatctg 
cagtttccct 
gaaggtggag 
catcaagaaa 
catcctcgag 
ttggggccac 
catgtcaggg 
aggacttcct 
gggcatcgtg 
ccttatggtg 
acttaatctg 
ctcagacaca 



agagcggcta 
aagcgtgtgt 
aagcgtaagc 
ctgagtgtct 
gtgaggtgat 
ccacccaaga 
gaatgaaaga 
cacataggct 
ggcggatcac 
tctactaaaa 
gggaggctga 
tcgcgccatt 
gaaattcatg 
actctcagat 
ccagtcacga 
tctgaacaga 
gtgtaggcaa 
ccccagggca 
ctcttggtgc 
acagaccagt 
tctctataaa 
ctggagtggg 
tccctccagg 
ggggaccgcc 
gggagggtct 
aggagatgga 
gacaccatgg 
aagcgggatg 
acctcaggcg 
tttggcctgg 
ccgctgcttc 



ctgtgacctc 
acatcatgga 
tgctgcccct 
cctgtctgct 
cctttccatt 
tcacataacc 
aaaaaaagaa 
gggcgcggtg 
ctgaggtcgg 
atacaaaatt 
gacaggagaa 
gcactccagc 
tataatcgtt 
gagatttaaa 
tgagtctggt 
acctcacatg 
gacccagagg 
gtgtctcagc 
tcttttcacc 
ttccagtctt 
ttgaggccat 
gcacagagga 
agcacctgcc 
ttggggggct 
gctctatgga 
ctgaagcaac 
gggtctggcc 
agctggtgtt 
ggtaccagaa 
ggctcattgg 
cccctgcagt 



cccacagggc 
tgtctacaac 
accctcatct 
aggccctgca 
ttacagatga 
cttacaataa 
aaataggata 
gctcacgcct 
gagtttgaga 
agctggatgt 
ttgcttgaac 
ttgggcaaca 
aaaatgaaaa 
aacagggctg 
gtggataagt 
tgctgagcct 
aggcagtgaa 
ttcagtgccc 
ttagttttgg 
gcctggtgtc 
ccatgtctct 
tgatgagtac 
cgacgtggtg 
gtccatcagc 
ctcagcagca 
agcagtttgg 
tgcctgagtc 
ccggatggtc 
gcgcacagcc 
gcctgagtca 
gccc 



aatgggcatg 
cgccacatct 
tgggtgtgtc 
gaagccactg 
ggaaaccgag 
acatgcattt 
aatttgaaaa 
gtaatcccag 
ccagcctgac 
ggtggcgcat 
ctgggaggcg 
agagcgaaac 
tgcattaaac 
ccacctttgc 
cagcagctag 
gggcttaagg 
atctgacatt 
cttctctcct 
gtggaatgag 
cacagtcttg 
ctcccagagg 
ctggataagg 
gtatacaatg 
ccagcggtac 
gcaggaaagg 
agcagggcta 
accctcctct 
cgtggccgcc 
cgcatcattg 
cccagcgtct 



agcgagactt 
acccagggga 
cttgtggatg 
cagtggttca 
acctggagaa 
gtctggcaaa 
tacgaaataa 
cactttggga 
caacatggag 
gcctgtaatc 
gaggtttcgg 
tccatctcga 
tcatcaatca 
aggtagggga 
tatggcccaa 
gcagggcagg 
gccgacacag 
ttgagtcccc 
gctgagcagt 
tcctgagcct 
ccatcaggcg 
tggagaggaa 
caggcaccga 
gtcctgaccc 
tgggcggcct 
gccctgcagc 
tcccctaaca 
gggtgcccat 
ctgactccat 
ccgcacagaa 
SEQUENCE LISTING 



<110> Novartis AG 

<120> Histone Deacetylase-Related Gene and Protein 

<130> Case 4-32094A 

<160> 4 

<170> PatentIn version 3.0 

<210> 1 

<211> 347 

<212> PRT 

<213> Homo sapiens 

<400> 1 

Met Leu His Thr Thr Gln Leu Tyr Gln His Val Pro Glu Thr Pro Trp 
1 5 10 15 

Pro Ile Val Tyr Ser Pro Arg Tyr Asn Ile Thr Phe Met Gly Leu Glu 
20 25 30 

Lys Leu His Pro Phe Asp Ala Gly Lys Trp Gly Lys Val Ile Asn Phe 
35 40 45 

Leu Lys Glu Glu Lys Leu Leu Ser Asp Ser Met Leu Val Glu Ala Arg 
50 55 60 

Glu Ala Ser Glu Glu Asp Leu Leu Val Val His Thr Arg Arg Tyr Leu 
65 70 75 80 

Asn Glu Leu Lys Trp Ser Phe Ala Val Ala Thr Ile Thr Glu Ile Pro 
85 90 95 

Pro Val Ile Phe Leu Pro Asn Phe Leu Val Gln Arg Lys Val Leu Arg 
100 105 110 

Pro Leu Arg Thr Gln Thr Gly Gly Thr Ile Met Ala Gly Lys Leu Ala 
115 120 125 

Val Glu Arg Gly Trp Ala Ile Asn Val Gly Gly Gly Phe His His Cys 
130 135 140 

Ser Ser Asp Arg Gly Gly Gly Phe Cys Ala Tyr Ala Asp Ile Thr Leu 
145 150 155 160 

Ala Ile Lys Phe Leu Phe Glu Arg Val Glu Gly Ile Ser Arg Ala Thr 
165 170 175 

Ile Ile Asp Leu Asp Ala His Gln Gly Asn Gly His Glu Arg Asp Phe 
180 185 190 

Met Asp Asp Lys Arg Val Tyr Ile Met Asp Val Tyr Asn Arg His Ile 
195 200 205 

Tyr Pro Gly Asp Arg Phe Ala Lys Gln Ala Ile Arg Arg Lys Val Glu 
210 215 220 

Leu Glu Trp Gly Thr Glu Asp Asp Glu Tyr Leu Asp Lys Val Glu Arg 
225 230 235 240 

Asn Ile Lys Lys Ser Leu Gln Glu His Leu Pro Asp Val Val Val Tyr 
245 250 255 

Asn Ala Gly Thr Asp Ile Leu Glu Gly Asp Arg Leu Gly Gly Leu Ser 
260 265 270 

Ile Ser Pro Ala Gly Ile Val Lys Arg Asp Glu Leu Val Phe Arg Met 
275 280 285 

Val Arg Gly Arg Arg Val Pro Ile Leu Met Val Thr Ser Gly Gly Tyr 
290 295 300 

Gln Lys Arg Thr Ala Arg Ile Ile Ala Asp Ser Ile Leu Asn Leu Phe 
305 310 315 320 

Gly Leu Gly Leu Ile Gly Pro Glu Ser Pro Ser Val Ser Ala Gln Asn 
325 330 335 

Ser Asp Thr Pro Leu Leu Pro Pro Ala Val Pro 
340 345 

<210> 2 

<211> 1755 

<212> DNA 

<213> Homo sapiens 

<400> 2 

agctttggga gggccggccc cgggatgcta cacacaaccc agctgtacca gcatgtgcca 60 

gagacaccct ggccaatcgt gtactcgccg cgctacaaca tcaccttcat gggcctggag 120 

aagctgcatc cctttgatgc cggaaaatgg ggcaaagtga tcaatttcct aaaagaagag 180 

aagcttctgt ctgacagcat gctggtggag gcgcgggagg cctcggagga ggacctgctg 240 

gtggtgcaca cgaggcgcta tcttaatgag ctcaagtggt cctttgctgt tgctaccatc 300 

acagaaatcc cccccgttat cttcctcccc aacttccttg tgcagaggaa ggtgctgagg 360 

ccccttcgga cccagacagg aggaaccata atggcgggga agctggctgt ggagcgaggc 420 

tgggccatca acgtgggggg tggcttccac cactgctcca gcgaccgtgg cgggggcttc 480 

tgtgcctatg cggacatcac gctcgccatc aagtttctgt ttgagcgtgt ggagggcatc 540 

tccagggcta ccatcattga tcttgatgcc catcagggca atgggcatga gcgagacttc 600 

atggacgaca agcgtgtgta catcatggat gtctacaacc gccacatcta cccaggggac 660 

cgctttgcca agcaggccat caggcggaag gtggagctgg agtggggcac agaggatgat 720 

gagtacctgg ataaggtgga gaggaacatc aagaaatccc tccaggagca cctgcccgac 780 

gtggtggtat acaatgcagg caccgacatc ctcgaggggg accgccttgg ggggctgtcc 840 

atcagcccag cgggcatcgt gaagcgggat gagctggtgt tccggatggt ccgtggccgc 900 

cgggtgccca tccttatggt gacctcaggc gggtaccaga agcgcacagc ccgcatcatt 960 

gctgactcca tacttaatct gtttggcctg gggctcattg ggcctgagtc acccagcgtc 1020 
tccgcacaga actcagacac accgctgctt ccccctgcag tgccctgacc cttgctgccc 1080 

tgcctgtcac gtggccctgc ctatccgccc cttagtgctt tttgttttct aacctcatgg 1140 

ggtggtggag gcagccttca gtgagcatgg aggggcaggg ccatccctgg ctggggcctg 1200 

gagctggccc ttcctctact tttccctgct ggaagccaga agggcttgag gcctctatgg 1260 

gtgggggcag aaggcagagc ctgtgtccca gggggaccca cacgaagtca ccagcccata 1320 

ggtccaggga ggcaggcagt taactgagaa ttggagagga caggctaggt cccaggcaca 1380 

gcgagggccc tgggcttggg gtgttctggt tttgagaacg gcagacccag gtcggagtga 1440 

ggaagcttcc acctccatcc tgactaggcc tgcatcctaa ctgggcctcc ctccctcccc 1500 

ttggtcatgg gatttgctgc cctctttgcc ccagagctga agagctatag gcactggtgt 1560 

ggatggccca ggaggtgctg gagctaggtc tccaggtggg cctggttccc aggcagcagg 1620 

tgggaaccct gggcctggat gtgaggggcg gtcaggaagg ggtacaggtg ggttccctca 1680 

tctggagttc cccctcaata aagcaaggtc tggacctgca aaaaaaaaaa aaaaaaaaaa 1740 

aaaaaaaaaa aaaaa 1755 

<210> 3 

<211> 1044 

<212> DNA 

<213> Homo sapiens 

<400> 3 

atgctacaca caacccagct gtaccagcat gtgccagaga caccctggcc aatcgtgtac 60 

tcgccgcgct acaacatcac cttcatgggc ctggagaagc tgcatccctt tgatgccgga 120 

aaatggggca aagtgatcaa tttcctaaaa gaagagaagc ttctgtctga cagcatgctg 180 

gtggaggcgc gggaggcctc ggaggaggac ctgctggtgg tgcacacgag gcgctatctt 240 
aatgagctca agtggtcctt tgctgttgct accatcacag aaatcccccc cgttatcttc 300 

ctccccaact tccttgtgca gaggaaggtg ctgaggcccc ttcggaccca gacaggagga 360 

accataatgg cggggaagct ggctgtggag cgaggctggg ccatcaacgt ggggggtggc 420 

ttccaccact gctccagcga ccgtggcggg ggcttctgtg cctatgcgga catcacgctc 480 

gccatcaagt ttctgtttga gcgtgtggag ggcatctcca gggctaccat cattgatctt 540 

gatgcccatc agggcaatgg gcatgagcga gacttcatgg acgacaagcg tgtgtacatc 600 

atggatgtct acaaccgcca catctaccca ggggaccgct ttgccaagca ggccatcagg 660 

cggaaggtgg agctggagtg gggcacagag gatgatgagt acctggataa ggtggagagg 720 

aacatcaaga aatccctcca ggagcacctg cccgacgtgg tggtatacaa tgcaggcacc 780 

gacatcctcg agggggaccg ccttgggggg ctgtccatca gcccagcggg catcgtgaag 840 

cgggatgagc tggtgttccg gatggtccgt ggccgccggg tgcccatcct tatggtgacc 900 

tcaggcgggt accagaagcg cacagcccgc atcattgctg actccatact taatctgttt 960 

ggcctggggc tcattgggcc tgagtcaccc agcgtctccg cacagaactc agacacaccg 1020 

ctgcttcccc ctgcagtgcc ctga 1044 

<210> 4 

<211> 23434 

<212> DNA 

<213> Homo sapiens 

<400> 4 

ctacacacaa cccagctgta ccagcatgtg ccagagacac gctggccaat cgtgtactcg 60 

ccgcgctaca acatcacctt catgggcctg gagaagctgc atccctttga tgccggaaaa 120 

tggggcaaag tgatcaattt cctaaaaggt atggaaggtc ccccttggac tctcatctgc 180 
ttcctccaac ccacctgtcc tctccgtcct catccccaac ataagcctca ggctctctcc 240 

catcttcagt ttcagccctc ggatggcctt ccacccatgc ttccgcccaa aatgattttt 300 

ccaacacaga ctcctaatca cgatatgatg tccctgactc agactctccc tggctcccca 360 

tcctgtgggc ctaagtcctg cctctgccca agaggcctag tggaaaggta gctgattact 420 

gatgggcaca gggaaggtga agcttggagg agtccatttc ctaaggttca gagagtcagg 480 

aggtagagca cctccaccgc acctctcttg attacagatg ggggaaattg tgtcctagaa 540 

tgattaggaa acatgtgcac ccaattccag tccagtcctc acagcagccc tcggggtagg 600 

caccacaatc gcagcagagg ctcaggagct cactgtaacc tccgcctttc aggttcaaac 660 

aatttttctg cctcagcctc ccaagtagct ggaattacag gcgtgagcca ccacacccgg 720 

ccctgatttc ttaatatggc actcattata agattgtaaa agcccacctg tagaccgaac 780 

tgggcacact ggctgcctgc ttgtgacctc tttccaggga aggacacagc tcccattagt 840 

ggctgaagta acacagttac aagaggcgga gttgggtttg gaactcagag ctccaggcgc 900 

cctaccttta gggctcatcc ccttgagcaa aatgatgctt cgaagagcat atcgttttaa 960 

ctgtggtttg taatcaaggg gcctgattta ggtgggaaat tcacttaaac ttgttttaaa 1020 

aggaaacatt atgtcatcaa aatgggaaaa ggcagtttca cttgccataa ataggtcatg 1080 

gtaaaaaagt aaatgcaatg aaaacaacag tataattcaa tccaggctgg ttactattgc 1140 

ctgcaggctg tgagactgat tagtggtttg aacggaagat gagcaaagca caggcaggtg 1200 

ttgcgaggcc atgccacact gagcctcctg taatatcatc agaaggtgga gggaggccgg 1260 

gcgcagtggc tcgtgcctgt aatcccagca ctctgggagg ccaaggctag gagaacactt 1320 

gaggccggga gtttgagacc agcttgggca acatagcaag atcctgtctc tacaaaataa 1380 
aaataaaaaa cttagctggg ggtggtggta tgcacctata gtcctagcta cttgaaatgc 1440 

tgaggcagga ggatcacttg agcccagaag ttcgagggtg cagtgagcta tggttgtgcc 1500 

actgcactcc agcctgggct acagagcaag accttgtctt gcatttattt atttgtttat 1560 

ttatttattg agacagggtc tcactcccat cacccaggct agagtgcagt ggcggaatca 1620 

aggctcattg ccacctcaac ctccctggct taagtgatcc tcccacctca tgtttttgat 1680 

attttgtaga gatgcggtct cactatgttg tctaggctgg tcttgaactc ctgggctcaa 1740 

gcaatccacc tgcctcaacc tcccaaagtg ctgggattac aggcgtgaac caccacacct 1800 

ggccaagacc ctgtctcttt aaatgaatta aaaaaaaaaa aagggcgggg ggaaggtgga 1860 

gggggaattc ctaagaagag tttttctcac tctgagggtc aacatccctg acccttgtgc 1920 

cacctgctcc tgaaggttgt ctagcacacc tgagctctcc ttgtgactat cagtggcttg 1980 

ggaaacatgg ggattgctgt gtgtacgatg ttcattgctc cctggccaga gggactggcc 2040 

actgtccaca gtggctgggg aggctacccc ttctcagaag gcccacaagc cagcagtgcc 2100 

tacctacccc tggggcaggg gctgccacag gccaagtctg cagcctgtgg gagggtctgg 2160 

ggctggccct ggccttgagg tcagtgggga agcaggatgc tccctctgtg gtttcagaag 2220 

agaagcttct gtctgacagc atgctggtgg aggcgcggga ggcctcggag gaggacctgc 2280 

tggtggtgca cacgaggcgc tatcttaatg agctcaaggt acaggatgtc gggcctgggg 2340 

ggctgcgggc ctggggcagg gggctgctgg ccaggagtgg ccagaggcag gaggtgactc 2400 

agcctgggga agccaagtct cacagggcac ccattcatgt ccctagtgtt ggaggaacat 2460 

gggagtctgt ggtccccaag agaaggagag aggtcataaa aaggcagacc tcagtttggg 2520 

ccaggccact ctgagggtgg tgtcctcccc ttctccaggg cgtatgaaag ccttcataga 2580 

attttaggct tctacattat gactttcaag ctgtgctctg tcgacacgcc tccgagaccc 2640 
cagcccctgt cctccaacca tacatagctc tttcactttg gtctattttg tttgtttgtt 2700 

tgttcatttt tttgttggtt tttgttcttg aaatggagtc tcactctgtc gcccaggctg 2760 

gagtgcagtg atgtgatctc ggctcattgc aacctccgcc ttccgggttc aagcaattat 2820 

cctgtctcac cctcctgagt gagtagctag gattacaggc gcgtgccacc atgcctggct 2880 

aatttttttt gtattttagt agagatgtgg tttcgccgtg ttggccaagc tggtctcgaa 2940 

ctcctgaact caggtcatct gcccacctcg gcctcccaaa gtgctggggt tacaggcgtg 3000 

agccaccgca cccaccctat tttttatatt gggctgaagt ttaagactct ggtctaagta 3060 

cttctgctga agttttgttg aaaattgttg gtctaaaaac taatttgaaa ccctcagggc 3120 

tcagcagaga agagaaacaa gtgggagggc cggtggtaga gtctgaggtg aactcctgcc 3180 

ccttcccaag gggcggctcc tcagctccac tgtgggcccg gcatggccag agcacctggt 3240 

cttcaaagag aagccaggaa tccagattat taagtgacat ttcctgattt tttttttgag 3300 

actgagtctc gctcttgttg agcaggctga agtgcagtgg cacgatctca gctcactgta 3360 

acctccgcct cccgggttca aacaattttt ctgcctcagc ctccgaagta gctggaatta 3420 

taggggtgag ccaccacacc cggccctgat ttcttaatgt ggcactcatt ataagattgt 3480 

aaaagcccac ctgtagacca aactgggcac actggctgcc tgcttgtgac ctctttccag 3540 

agaaggacac agctcctatt agtggctgaa gttctgaggg ctgaggcatt cagttcagtg 3600 

ctctttgtag gaacagaggg gaggttgggg cgggggcttg cattggaatc tggtactgcc 3660 

agcctgcctt ggtgggggtg gggtcaggga tgcctcaggt tatctgcccc aagagtgtgg 3720 

gagccctgac ccccaggctc cctggctgag ctcaccttag actcagagcc acagtggatg 3780 

cctgaggcca gcaggcccct ctgctccaca ggtggaaaag cctaggtcca gaaagaggct 3840 
gtgctcaagg tcacctggga agttggccgg gccttgggga gacccctggc aggtcatcca 3900 

gtccagtctt ctaggttccc agtcagggct gctgcctccc tgctccccaa ccgcagcctg 3960 

aggtgtgaga attctagata gggccacgac agtgtgagca catgaaagat taccaggaag 4020 

aggttgaaac ctggctcctg ggagagagag gggtgtgagg ccttggcagg aagcccagtg 4080 

cttggctgcc ctggtttcct ggggcccagg catgcgtggt cacagtccac agcctagggc 4140 

tgggccagga ggacatgcct gccagagtcc cgagggtgag gggaaggaag ggacaggagg 4200 

cgctcagctg gggcagggag aaaccaaaac agaatggtgt gattgaacca ggctgggggt 4260 

ggggggccta ggttccaggg ccccacccat ttgaggggcc ttcaggggaa ctgtgttggg 4320 

caggctgcat gcctggcctt ggtcccccaa aagcctgaaa gcagcttact atgtgatata 4380 

taataataca aaatagctgg gtgtagtggc atgcacttgt agtcctagct acttgggagc 4440 

ctgaggcagg agaccttgag cccaggagtt tgaagctgta gtgagctatg attgcaccac 4500 

tgcactccag ctggtatgac agagtgagac tgtctcttaa aaaaaataat aaaagtatta 4560 

acaggtagag tcccaagtag aaaactgagg ttgagggtag gaggagaatt caggtatgtc 4620 

cactgaaaaa gttaaccaag atggtgatcc agctgcatat ttggcttgga gctccctggc 4680 

agtcagaaca aaaggagaaa catgatggtt tctacggcac ctattaagat gaagaagtag 4740 

gccgggtgca gtgactcatg cctgtaatcc cagcactttg ggagaacgag gcgggcggat 4800 

cacttgaggt cgggagtttg agatcagcct ggccaacatg gagaaaccct gtctctacta 4860 

aaactacaaa attagccagg catggtagtg catgcctgta atcccagcta cctgggaggc 4920 

tgaggcagga aaatcacttg aacctgggag gtagaggttg cagtgagccg agattgcgcc 4980 

attgcactcc agcttgggca ataagagtga aactccatct caaaaaaaaa aaaaaaaaaa 5040 

aaagaaaaag atgaagaagt agtcagtcat tcaacacatc tgtattgaat gccaactgta 5100 

cagagagaat aagacagcag ggctctctgc caccatggat ttgcatttga gtcgtggaag 5160 

attaaaatta aggaagcaac cacccaagag cattttagag agcaccaagg gctatgaaga 5220 

aagtgaaaaa tagagggtaa ttggatggtc agggagggcc tcacagagga ggtgatgttt 5280 

gagttgagac taaacaaagg agcaggtgat actcatgtag aggtgttttt tttttttttt 5340 

tttttttttg gagaaggaat ctcgctttgt tgcccaggct cgagtacagt ggtgcgatct 5400 

cagctcacag caacctctgc ctcttggttc aagcgattct cctgcctcag cctcccaagt 5460 

agctaggatt acaggcacct gccaccatgc ccggctaatt tttgtatttt tagtagagac 5520 

ggagttttca ccatgttggc caggctggtc tcgaactcct gacctcaagc aatccatctg 5580 

cctcggcctc ccaaagtgct gggattacag gcatgagcca ctgctcctgg ccctcatgta 5640 

tagctttgaa ggaagaatgt ttcagaatcc caggcctgga gggtggaggg gacttgatct 5700 

tccaaagggg agaagaatgc ttgggaggcc ggatggaagg gaataaaaca ttgtggctcg 5760 

tacacggtgc agttagggag gccagagccc caggccacac aaggtcttgc aggccgtggg 5820 

aggagtgtat atgttgttcc agggaccttg gacagtcacg agggggtttt cagcaggagg 5880 

gtgatatggt gtgacatgcc cttgctgccc aggtgggacc caagcccgtt tcagacatca 5940 

tctggcacct aaggctgcag ctcaggaaca tctcccacct ccctgcagat gtctgcaatg 6000 

tttcttttct ccttcctctg ctgtgggcgc ccagagagtg ccctagagag tccttcaggt 6060 

ttctcaggct gcttttccct ggtcattctg tgtgtgctgt gtaacatcca ccgtctcccc 6120 

tgcctcatcc cattctaccc ccaacccctg cctggggctc atgcctgact ctgcactggt 6180 

gtggcctttg atacttaata aacagggcac tgaaggagaa gcaggagctg gacgtttgca 6240 

agatgtcaat tcagggaaac ccatgtttat caagctcctg ctgtgtgcaa ggtccagggt 6300 
tggcccctct gagggtagct gttgagctcc ccagtgcccc agcactgggc tcttgccttt 6360 

ggttgattct cgggtgacag tttgccatgg agtgtcggtt agtgctgggc agcatctgac 6420 

atcctgcccc tgtgcactct gcactggaca gtgctcagaa cacgtggatc cagcaagtgc 6480 

tcagagggca ccactctgtg atctaggtgc tgcagggatg ggatggagca aaagaccaca 6540 

tccctttcct gctggagctg gcatttaggt gggagagtca gacaataaat gtaataatta 6600 

agtaatgaga taatatgtta gatggtgctg agtgtcgtga agaaaggaag ggacagcaga 6660 

aaaggggtgg ggagagctgg tgagaggatg gcagttttaa atcaggagtc aggaaagggc 6720 

ttactacctg tgatcacagg tgacatgtgg gaagggagtg agggagtggg tgatgtggtc 6780 

atctggggaa gggcattcca agcagaagaa acagcaagtg caaagatccc agggcagaac 6840 

tatctgtcat gagttccagt atagtgtgga gagaaggaga cacagaccat agctccatgg 6900 

agcacctgga gggaccctgg agagtctcta ggggagtgag ctcctcttgg tctccaactc 6960 

tctcttctct tccctgaggg gctcctctct cctttaaaaa aaaatttttt ttaattgtgg 7020 

taaaatttac ataacaaaat tcgccattaa ccactttaaa ctgtacagtt cagtggcctt 7080 

tagtccattc acaaagtgct gcaaccatca tctctagttc caaacatttt catcactcca 7140 

aaaggaaacc ctgtgtcctt taaacacttg ctccccattt atccccccaa gtccccttgg 7200 

taatcactca cctgcattct ctctctatgg atttgcctat cctggatatt tcatataaat 7260 

ggaatcatac aatatgtgac cttttgtgtc tggcttatct cactaagcac agcgttttca 7320 

acattcatct gtgttgtgtt gtagcatgta tcagtacttc attccttttc acagcagaat 7380 

gatattccat tgtaaaacac tacatttttt ttatccattc attagtttat aggccttttg 7440 

gctattgtga gtagtgttgc tgtggacatg tgcatacgag tatttattag aatacctgtt 7500 

ttcagttatt tggggtatac acctaggagt agaattactg ggtcacatgg taattctgtt 7560 

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taattttctg aagaaccatc aaggtgatct ccacgggggc tgcaccattt ccaccagtaa 7620 

tgtaccaggg tcccaatttc tctacatcct tttcaatgct tgttattttc tggtgttttt 7680 

ttttttcccc cccagtgtgg ccatcttact ggatgtgaag tggtatctca tggttttaat 7740 

ttgcatttac ctaatggcta attaacactg aggatctttt catgtgctga ttggctattt 7800 

gtatatgtca tttggagaaa tgtttattca agtcctttgt ccatttttaa aattggcttg 7860 

tctttttgtt gagttgtagg gttctttata tattctggat attatttaat ttgtaaataa 7920 

ctcctcccat tctgtgggtt gtcttttttt tgatagtgtc ctttgatgca caaaaatttt 7980 

agttttgctg aagtccaatt tatctttttt tccttttctt taggtgtcat atctaagaat 8040 

ccattgccaa acccaaggtc atgaaggttt accgcatgtg ttttcttcta agagttttat 8100 

agttttcact tatatttagg ccttgataaa ttttgagtta atttttgtat atgtgtgagg 8160 

caagtccaac ttcattgttt tgtactcaga tatccagtta tcccagcacc atttgttagg 8220 

ctgtttttcc cctgttgaat ggtcttggta cctttgtaga aaatcaactg gccatagatg 8280 

tatggattta tttctagact ctcaattcta ttcatttttt tggtttgttt gtttaagaaa 8340 

gggttgcatt ctttcgacag cccaggctgg agtacggtgg ctccatcttg gctcactgca 8400 

acctccgtct cctgggttca agcaattctc ccatctcagc ctcccaggta gctgggacta 8460 

caggcgtgtg ctaccatgcc tggctaattt ttgtgtttct tggtagagat ggggtttcac 8520 

catgttggct aggctggtcc tgaattcgtg acctcaagtg atttgctcac ctcggcctct 8580 

caaagtactg ggattacagg catgtgtgag ccactgcgcc cagccaattc tattcatttg 8640

atctatatgt caataccaca ctattttggt actgttactg tggcttactg tggttattgt 8700 

ggctttggag caaattttga aattccagat tgtgaggcct ccaactttgt tctttttttt 8760 
tttttgagac gcagtctcgc tttgtcgcct atgctggagt gcaatggcgc gatctcggct 8820

cactgcaacc tccgccttct ggtttcaggt gattctcctg cctcagcctc ccgagtagct 8880 

gggattacag gcgcccggca ccacgcctag ctaatttttc tatttttagt agagatgagg 8940 

tctcaccatg ttggtcaggt tggtctcaaa ctcctgacct catgatctgc ctgcctctgc 9000 

ctcccaaagt gctgggatta caggcatgag ccaccgtgcc cagccaactt tgttcttttt 9060 

taagatcgtt ttggctgttt gaggtccctt gagattccat gtgaattata gcatcaactt 9120 

ccattttttg caaaaaaggc cattgggatt ttgacaggaa ttgcattgag taaattgctt 9180 

tggggagttt tgccatctta acaatattcg gtctttcaat ccatgaacat gggatgtctt 9240 

tccgtttatt tatgtcttta atttctttca gcaatgtttt gtagctttca atggacaaat 9300 

cttgcacctc ttggttaaat ctattcccat gcattttatt cttttcgatg ttattataaa 9360 

tgaaattgtt tgaatttcct tttaagattg ttcattgctg gtatatacaa taatcagttg 9420 

tatagaaata caactgattt ttttgtgttg atcttgtatc ctacaacttt gctgaatttg 9480

tttcttagca tttttttctt tttttttttt tttttttttt ttttagacag agtctctctc 9540 

tgttaccagg ctggagtgca gtggcatgat ctcggctcac tgcaacctcc gcctcccagg 9600 

ttcaagcgat ttttctgcct cagcctccca agtagctggg actgcaggtg catgccacca 9660 

tgcccagcta atttttgtat ttttagtaga gatggggttt cgccatgttg gccagtgtgg 9720 

tctcgatctc ttgacctcgt gatctgccca cctcggcctc tcaaagtgct ggtattacag 9780 

gcatgagcca ctgcgcctgg cctgtttctt agctttaata gttgtgtgtg tgtgtgtgtg 9840 

tgtgtgtgtg tgtgtgtgtg tgtgtgtatt ctttaggatc ctctatatat aacatcatac 9900 

cgtctgtgaa gagaggtagc ttcctttcca atttggatgg cttttattta tttttcttgc 9960 

ctaattcctc tgattggaac ttccagtact atgttaaata gcagtagtgg agcaggcatc 10020 
tttgtcttgt tcctgatctt agacagaggg ctttcaatat tttaccattg agtataatgt 10080

cagctgtggg gttaaatttt ttaacgcctt ttatcatgtt gagggagttc ccttctgttc 10140 

ctagtttgtt gagtgatttt atcacaaaag gctattgaat tttgtcaaag gctttttgtg 10200 

catcaactga gagatcgtgt tttccccttc tctgcttttg ctccccttct actggtagaa 10260 

aggacccacc taaagcaagc agtgggcgcc ctagaggggt tacagcctag ctcttccctg 10320 

agagcagttc ttggtttgaa cctgagggca gcgggtccgc ctgaggaaac caggtgtctg 10380 

gaaggtgaag gcttgtggag ctgagtagat ggggcagtag gtcccagaga tatggccagc 10440 

cccagtcatg tcctgctctc tgtggagtcc cacagaggct gacgaggtat gggggccctg 10500 

atagctggct acatgcaggc catgcccttt ggcgggtggt ggcgtcagtc tggggcagac 10560 

ctcccatgct cacatagtgt gctcattcac ccagcactgc cttaggttgg gctccctaga 10620 

atggtggctc ttaaacccca gcaagtatct gaaacactgg agggcttgtt ccagcagatg 10680 

gctgggcccc tcccagagtt tctgatccat gttgtcttgg gtagagactg ggaatctgca 10740 

tttctaatac attctcaagt gttgtggatg ctgctggtct gagaaccaca tccctagaag 10800 

cagagtctga gatggtgcag gcgatttcag atgaaccctg caagaggcac aggcagtggg 10860 

gagcgggcag agtgagcagc tgagcacaga tgtggatttg gaagtgtggc ctcagcctga 10920 

ttccatggag atctctgggg cgtgaatgtc accacagggt tgccctgccc agaagcatgt 10980 

ggcctggctg ttacaggccc ttgtcagtca tggctctcct gggatgatgc aggtgaggtg 11040 

gcttctgtca ggagaagggc tctggtgcac cagccagaaa aggggatcaa cggcatgcat 11100 

ggccagcacc tactgtgtgc caggcatggc ctcagcactg tctgcacagc agtgagcaga 11160 

cgcgtgctgt cctcctggag ctggcatcct tttgagggag atagatgcta atcgggacag 11220 
tctgtagcct cagggagaga agtgctatct gggaagatga agccaaggtg tgggctccag 11280

ggggccccag gtgggagtat tttattttat ttttttgaga cagagtttca ctctgtcacc 11340

caggctggag tgcagtggtg cgatcttggc tcactgcaac ctccacccct tgggttgaag 11400

agattctcct ccctcgcctc ctgagtagct gggattacag gcacctgcca ccatgcccgg 11460

ctaatttttg tgtttttaat ggacaccaga tttcaccatg ttggccaggc tggtcgtgaa 11520

ctctggacct caagtaatcc gcctacctca gcctcccaaa gttctgggat tacagatgta 11580

agccaccaag cctggctggg tgtggggatt ttagattaga tgaggaggac aggcctctct 11640

gactggtttc cacctctaag tcctcatcca aagccttgtt ttatagatga gacagaggca 11700

cagagaagtg aattctaaat tcacatagcc agtggcagaa cccagacttg gaccagtttg 11760

gggaacttct gagcctgtcc accccagtcc tagcctcacc cacagtgccc ttgcccaggg 11820

cgagactatc agggagcctg acctgctgga tctgggcagt cccaccgtgg catgctgcat 11880

gtcccagaga aggtatctgt cagcagtgca gcacccccca cctgccccac ccacagctcc 11940

ctcgggggct atccctggaa gtgttggtca gaaagtgaat ctccagatgt cacctggtgg 12000

tgccctgagc tcctcctacc tgccacctcc tctgaccaca tagagcctgc tctagcccag 12060

gccctcttcc ctctcctccc ctcacccagg gacccgccac tagtccgccc cacccactct 12120

gtttatttct caccttggcc actgatgggt ggtttctcct agagcggtgc tgccctgtgg 12180

aaccttctgc aatgatggaa atgctcagac ctgctctgtg cagtccagtc gccactggcc 12240

gcatgtggct cttgaaatat ggagagtgta actgaggaac caaacttgaa tttttaaaat 12300

tttgatgaat ttacaatcac tcgtaagtag ccacctgtgg ctggcagcca ctggattgga 12360

tggtgctggt ctagggtgtt ggcaaccaca tcactgcctt gtgcagaaac cactgctgca 12420

ccaggagaag gcccaagtgc cagcctcctc ttcactgccc gaagcctgct gctccgctga 12480

ggggctcgtc tcgccaacgt tggcacagca aacacacata ctttctcctg tgggggctgg 12540

tcctgctggc caagtcccgt gcatgctcct gggtggctgc acctggcccc tgcaccaggt 12600 

caggtccaat ctgtggagga taccaaggaa cctctttgag gttcccaagt gtgtcccatg 12660 

ccactgcagt tttgcagaag gttagtgtgt gtgacttaaa aggcaaagag ggcaggcaga 12720 

tcttctgaca tctgggggga gcaaagttag aatggaatat ttgctgcaga acttctcaga 12780 

gcctttagca tgctaggatg tgctgcaaat ctccaggagg caggcggcat aagccatgct 12840 

tcccaaacga cttgccggtg gaagcctcct tgaggagtgc tgtgcgagac ccgtggctgt 12900 

ggagcacacg agagaatgcc tttctcgtgg tttgtgtcca tgctgggctc tcggctgcat 12960 

tgtcttccag tctgtgtccc ctgctggctt cccagggagg gagggaggct gtgactccat 13020 

gtgctccttc agcggctcgt ttgtttgctc attcgttcat ggaaaaccat ggttccatgc 13080 

cagccacacg cggggcctct gccgggcagt gggatgagtg tggtgaacaa gaggagctga 13140 

tgacctcagg cagggacctt cctttctctg ggtctgtccc gcaacataca cacacgcaca 13200 

cacgcacacg gacatacctg tgcacacatg tatacacaag acacatacac acacatacat 13260 

acactcatgg gtgtgtcctg cagctgtctg gctgtgctgg tcccagctct tacactccca 13320 

ccccttccca ggccctgtga tgcctccatg ttaccgccag agggcctggg cttgtggaag 13380 

tggtgccccg tgggcacctc tccttcccga ccatgagtgg gaccctgctc actgccttct 13440 

ctaccagagt gagggagtga tgccagcttc ccccgccttc agccgccctt gccggcctgg 13500 

gctggtggcc atgggcattc cccagcagtg tgggcaggct gggtgcctgg cacccccagg 13560 

actatgacag aagcctcccc tggtggccag ggcctaagcc atgaggcccc tgctggggcc 13620 

tgacttaggg tgtgtcctgc cttttgtccg gccctgagtg gcctggctac agcacctctt 13680 
ggccctctga ggttcgtcac ccctctgcca tcacacccat ccctggccac cctctccctg 13740

cctgctgcct gctgtctgtc attgaacatg ctcgtgtttc tcccatccta aaactcctcc 13800 

tcctggttgg tgaacgcaat ggccacactt cccactttcc tctcatggaa tgtctgcagc 13860 

ttggtgcctc cctccacctg ctccttccag ccaccctctc tccacctggc ctcctgagca 13920 

ctgcacctta ggtctttcca catctcaccc tgtcccaggg aagcccttga tcgtccccag 13980 

gggttctctc tctgggcctt gcccttcagc atgggaagcc tgcagtccca acccagccct 14040 

tcacctccca ctctcccacc cctgttctga gctccagtct cacttaaacc tcagctgtct 14100 

cacctggctg ccccaggggc tgacttggcc catagagagc agaacctagt gccgcctctg 14160 

taccctgctt caggttcacc tccaagtgcc attaccctca caggccccag acccgacacc 14220 

tgggccctct accccttgtc cctgcatgct gcctgctaat acctgctcct cttaccaccc 14280

cagacccttc ttatctcatg cttcctctct agggctgcta cttctctatt cctgttcccc 14340 

taattggttc tccttgctgc agctagtgca gcttgggaca gcaccatcta tggttcccta 14400 

ctgccctgac gacaatgtgt gagcctgtgc taggagacca ggccctgtgt gataagctca 14460 

gcctgccctg ttccagctgc acccaccttc tctagatcat ggactcactt ctctgcccac 14520 

agataccttt ttcccttgac ctctgcatct ggataactcc tattcactct tcacctcctg 14580 

caaatgccat cacccccaga aagcctctct aataaccccc acccagttct cctcttcatc 14640 

accacactca tcacactgca aataagtgtc tgcaagtgtc ctggcatgag aatgggccct 14700 

ccagtgccca cctggggcac ctagcaggca cttagtaaat atttacaaag tgagtggctc 14760 

tgcctcgcgt gggtggggag cagggatgcg ttttcagcca ggagatggct tggggtttgg 14820 

gttcagctgg gcagccagtg ccatggatat ttacctggtg cacttggagg tcacagggca 14880 

cactctgtcc tgatcttagt gcagatacct ttcaggtacc gtagaccccc ccagcctcag 14940 
cagctggaga tgagggcagt gcatcccttt tgccaggaag gtccgattcc caatggacaa 15000

agaggcaatg cagtgcgagg gagccagagg ccagggctcc cgtcccagct ctgtcagtga 15060 

ctcattgtgt ggccttggga agatcctcgc tgcctaggcc tcagtgtccc cttctgtaca 15120 

gtgggtggtc tagactaatt tgttatccca aagcagtcct agacctgcac tgctgacttg 15180 

gagccctctg cacctcctgt tctgggcaca agagggcagc caagggcctc agaacgctga 15240 

ggaaccctgg ccaactagct ttaagaaatg cattgtgtaa actgctcttt actgagccca 15300 

gagcttgcca ggagcctggt agggttgtgg ctctggctct catttctacc aaaggaagtg 15360 

tgcttgacca gggagttcat ccaagggcac ctggaaactg tcctcaaggc atttcccggg 15420 

gaaccaattt ctcacgggtt gcctcagggt ggggaagcgg aggccaacag cccctgtctt 15480 

tttccgcagt ggtcctttgc tgttgctacc atcacagaaa tcccccccgt tatcttcctc 15540 

cccaacttcc ttgtgcagag gaaggtgctg aggccccttc ggacccagac aggaggaacc 15600 

ataatggtag gtggggtggg ggggcatggc tgggctgggg gcccccacac cccagggtcc 15660 

ttctcacctc ctttgccctg gaatgccctc ctcccactta gtagttgaac agaatcctaa 15720 

atattcctca aggctcggca acaatgaccc tttctccaaa agcctttttt ccccatcttg 15780 

ggacatcaga attctcttct catcgttcct tctcctatga cctcctattt gttaccgtaa 15840 

ttgctagtat ataatatacc tctccaccca ccaaagcgga tatcctagca ctatggcttt 15900 

aaggcacacc ccctcaccag ttttttcttt ctttctttct tttttttttt tgagtagagt 15960 

ctcgttctgt cgcccaggct ggagtgcagt ggtgtgatct tggctcactg caacctctgc 16020 

ctcctaggtt caagcgattc tcttgcctca ggctcctgag tagctgggac tacaggtgtt 16080 

cgccaccatg cctggctaat ttttgtattt ttagtagaga cggggtttta ccatgttggc 16140 
caggctggtc ttgaactcct gacctcaaat gatccactca ccttggcctc ccaaagtact 16200

gggattacag gctttgagcc accatgccca gccctaatgc acccaaaatt aagatggaga 16260 

actgatcctc catgacttca gtgatgaata agcctccacg tctcccccac tgcgggtgtg 16320 

gcaacaaaga atccccacag caaaattagg tttcacattg tgtgtgtggt ttttttaaaa 16380 

aaatgtgcca cacactgcct agttatttgg agatagagga atgtttcaca tgcaaatgta 16440 

tgaggatcta acccagccct ggatcactac ctactgatcc cctacagttc tgttatgttt 16500 

gtaaatttgt actttttcct ttagcttagt agaatattac tgcccatccc caaaactatg 16560 

atttcctgga agatttcagt atttagtcta ctatatttct ttttttgctt tttttttttt 16620 

tttttttgag acagagtctc actctgtcct ccaggctgga gtgcaggggt gtgaccttgg 16680 

ctcactgcaa cctctgcctc ctgggttcaa gtgattctcc tgcctcagcc tcccgagtag 16740 

ctgggattac aggcacacgc cactctgcct ggctaatttt tgtattttta gtagagacgg 16800 

ggtttcacca tgttggcgag gctggtcttg aactcctgac ctcaagtgat ccgcctgcct 16860 

cggcctccca aagtgctggg attacaggcg tgagccactg cgcctggccc agtctactgt 16920 

atttctgtga gcaaaacttt gcctattttc cctttgaaag ccatatcaaa attattgtca 16980 

gctcatatgt gatggatgat aagtactttt attttttcca gtttccttgc acaatttcaa 17040 

aggtgcttat gcactgtaca tctcatatgc cagccaagct ggcacttact tcctggactg 17100 

ttgcttgggg tagggagttc cttctatacc cctgccttgt agctcagctc atccttcccc 17160 

cagagctggc tagaagcagt gtttatggaa tgagtgcatg aatcagtgaa tgaatgactg 17220 

gtggatcggc tgcctgcgcc ccctcaccct ctgcttgtct ccaaaggcgg ggaagctggc 17280 

tgtggagcga ggctgggcca tcaacgtggg tgagtgctgg gaatgtcctc gggaatgtcc 17340 

agcccggctt ggtggaactg gcctgaaagg gggctggggg agggcgggag gatcctggag 17400 
gtggcagctg tgaattcaga agctctggtt ttcccaagtc accctagcct ccttgtggag 17460

tggcctggag gttgatgtgt agcctcctag gtacctggga gagactgacc agtgcctcca 17520 

tctgacgtgg gatccttgtc taaggaggtc cccgggtggt tccccagccc cctctttgcg 17580

tacttccggt ggcagggagc ttcctccctt ccagagagcg tgtgccatcc ttgggcagct 17640

cagcatggtc tgaagcctgc cttgtgtctt ccctgaagga ctccacctgt gtcctggggc 17700 

ccaggacagc ccacagaggc ttggtcatgt tgggttgggt gggcacatcc tgggtcaata 17760 

ccaccacctt ctcaagggtc cagagggccc gtgctcccca gcccccttga atctcccaca 17820 

agattggctc atgggagggc tgcacgggag tctcccttgt ccctgtcatt gtccctcctg 17880 

gaggcacagc acttgacaat ttacaaagct ctttttcacc aggctctttt tttctttttc 17940 

gagacgtagt ttcactcttg ttgcccaggc tggagtgcaa tggcgcgatc tcggctcacc 18000 

gcaacctccg cctcccaggt tcaaacaatt ctcctgcctc agcctcctga gtagctgaga 18060 

ttacaggcat gcaccaccat gcccggctaa ttttgtattt ttagtagaga cagggtttct 18120 

ccatgttggt caggctgggt cttgaactcc cgacctcagg tgatacgccc acctcgctcg 18180 

gcctcccaaa gtgctaagat tacagacatg agccaccacg cccggccttc acccagactc 18240 

ttatttgagc tgggcataat tgtcaggcct gtctcactga tgaggaaatg gccatggaaa 18300 

gatgcgtact ggatcgtgta gagccctaaa gcagggtccc ccagcctttg gctctgaact 18360 

ctgcagggga gagtccacct tgggccactg cacagttgag gggagcccca ctctgcaggg 18420 

gctgggtctc ttccatcttg gtattaccag gtgcctagca ttcagtctgg catagtaatg 18480 

atgttatggt actctgctgc acaaacccgg gagtgatctg tgccctgcgt gtctacagca 18540 

gggttccgag gagggcctgg atggccctcc ccatggcagg tgttactgcc tggtagaggt 18600 
taagagcctg gatcctgatc caccctgggt ttgatcctgg ttctgccatt acctggctgt 18660

gtgaccctgg gcaagttgct gacctcctct gtgggtcagt ctcctcatct gtaaaatggg 18720 

gatggtgatg ctaatgcccc tcctcgggct ggagggagtc ttcagcaagc tcagttgctc 18780 

agtcaggtgt tcactgtggc tgtcttctca tcattaggag ccaacagtag cctcctgggg 18840 

ggtgggagag gcaagttcct ggtatccatg gggccagctg cacactgtct gacggagcag 18900 

ttgttgggct caatttcaga gggcctctgc aattcaggcc atcccagggg ctgcagggga 18960 

gggggtatct atgggcccta gggctctgag gctgtgtctc agggttgagg ggtgatggat 19020 

cccgggctct agggccctcc tcgtggctgt aggcagtcat gaccagcaga gggtgccctt 19080 

cctgaccacc cgctttggcc actggcagaa tccgtgtggc ccccatacca ccactccttc 19140 

ctggagtggg gagccacatg gagccaggcc cagcttggtg gggacaagga gcagctttct 19200 

gcttctggaa tgatgagcta tctgttgctt aggggtgtga gtggcactga ggacttgctg 19260 

gggacaccct gaagatgtgg ctgccttctg gcctggggat ggtgacatgc cccagcactc 19320 

agcttagttt gccaacccag agtccgaggc acaggttcct gagagctgag cagggaggat 19380 

gctgggggag gtgaagggat ggaggagctc ctggactgag cctgggagcc tggctctgag 19440 

cagcaccgct ctctgccctt ccgcaggggg tggcttccac cactgctcca gcgaccgtgg 19500 

cgggggcttc tgtgcctatg cggacatcac gctcgccatc aaggtgtgtc tatgagcaag 19560 

tggggtctcg cctccaagag ccctcctgga atcctcccca tagctccaaa ttaactgttc 19620 

tcaccctgaa ttatagacaa ggggcctatg ctggagcagg gagggggctt gtttgggttg 19680 

ctcagccagg ctggaactga atccagatct gacacttgct cctcttccat gttgcttaga 19740 

agggttgcct gtggtggaag ggagttattc cagcctccca cagagccagg ggactagaga 19800 

gggtcaggat ctgctgtata gccacatatt aagttgtagg aagaagggca tggctggcaa 19860 
agggagtagg gagtggaaag aatgatggtg ctgatagcac ctggcagttc tgcatgctcc 19920

aacccgcgct gtgctccagg acttactccc tgaatcctcg cagacagaca ggggcccaca 19980 

gaggtgaggg catgcaaata gcaggggcag aattggcgct ggcctctggt ctgtggggcc 20040 

ccacaactcc cctgccactc tgtgcctggc cttgtgctgg gcatcaggaa ctgactgacc 20100 

tgttcctatg tgtgcctgct ctcatggggc acatagactg atggggggaa gcaggccatt 20160 

aggagaaggg ggaagcacag gagaccttcc tggggaggag ggaatgaagg cttcctggaa 20220 

gagggggcat ttaggacttg gccttgtagg ataaggcaga ggttggggac tgaagtccca 20280 

gggctgtggg gattctctcc ttaaccccta cacatttcct agggaatctg ggaaaatcca 20340 

gggcctgagt gacccactta cctcctgacc tatgaccctt cagggcacag gacatgcccc 20400 

ctcctccagg gagccttccc tgaccacctc ctgcatgcac acatggagcc ccacagctgg 20460 

agctgcacag ctctccctgg caagtgacat ctttgctggg tggcctgatt acccacaagc 20520 

attaggcccc cctccccgcc cctcgccagc cagctgggag ttgctgtagg gctgggtcct 20580 

ctgtccgccc cagatcctca tgtctaccct ctcctccctg gcagtttctg tttgagcgtg 20640 

tggagggcat ctccagggct accatcattg atcttgatgc ccatcaggtg agtgccctgc 20700 

aggggctgga ctcttagggg acctgccacc cccagttcca gaatcttccc ggggcaggag 20760 

agtctccctc ctcatgtccc cacggctctc acggcttctg tcttctgtct ctcgggctac 20820 

aaatgcaggg tctgtctttg tcactctgtc caggacagcg ggtcctcctc attgctcccg 20880 

agggtcctcc ctccctcctc ctgactgccc ccacatgagg ctcttcctga agcccactct 20940

gatgggactg ctctcgtgtg cagagctctg ctgtgggtcc ccattgctta tgaataattt 21000 

ggggcactgc cccctgccca gagctgctga gcactggcca cctgcccctc aggcggatgc 21060 
ccacacacat ggcttggctc gggcacctgg ggtcaccatt taagaactcg gcgcctaggg 21120

agtaaagtgt caaagcagag ggttacctcc tcctcaggac ccctaatgag gccagtgcct 21180

ctggtcagac agggagggga cccagtgggc tccggaaggc acccccctgc accattactg 21240

ctgtggcttt gtgctagttg gggccctgcc ttgggttctt gcgaccccga actcctgagc 21300

caggtcacat gtggacagtc ctttacagtt tgcttttcac atccctgatc ccaaccagtc 21360

ccaccacaga cttgagaggg tggcagagcg ggatttcttc ctctgatagg gaacctaaga 21420

gcactgggct tgctcaagcc catgctagaa ggtgtcgggg cctggtttta aggttgaatc 21480

ccagctctgc cccttaacag tcatgagacc tgctgccccc gagagcaggc cgtgctgccc 21540

tggcaaatgg ggagtttcct gaggggtggg tgggtggcag agccccagcc ttgcctaggg 21600

cacctacccg agagcggcta ctgtgacctc cccacagggc aatgggcatg agcgagactt 21660

catggacgac aagcgtgtgt acatcatgga tgtctacaac cgccacatct acccagggga 21720

ccgctttgcc aagcgtaagc tgctgcccct accctcatct tgggtgtgtc cttgtggatg 21780

aggctctctc ctgagtgtct cctgtctgct aggccctgca gaagccactg cagtggttca 21840

tagcatccct gtgaggtgat cctttccatt ttacagatga ggaaaccgag acctggagaa 21900

gtcactcgac ccacccaaga tcacataacc cttacaataa acatgcattt gtctggcaaa 21960

aaacaggaaa gaatgaaaga aaaaaaagaa aaataggata aatttgaaaa tacgaaataa 22020

gaaataaatt cacataggct gggcgcggtg gctcacgcct gtaatcccag cactttggga 22080

ggctgaagtg ggcggatcac ctgaggtcgg gagtttgaga ccagcctgac caacatggag 22140

aaaccccatc tctactaaaa atacaaaatt agctggatgt ggtggcgcat gcctgtaatc 22200

ccagctactc gggaggctga gacaggagaa ttgcttgaac ctgggaggcg gaggtttcgg 22260

taagccgaga tcgcgccatt gcactccagc ttgggcaaca agagcgaaac tccatctcga 22320

aagaaagaaa gaaattcatg tataatcgtt aaaatgaaaa tgcattaaac tcatcaatca 22380

aaaggcagag actctcagat gagatttaaa aacagggctg ccacctttgc aggtagggga 22440

catttttgca ccagtcacga tgagtctggt gtggataagt cagcagctag tatggcccaa 22500 

ggaaccaatt tctgaacaga acctcacatg tgctgagcct gggcttaagg gcagggcagg 22560 

gtgtccatgt gtgtaggcaa gacccagagg aggcagtgaa atctgacatt gccgacacag 22620 

atctccacac ccccagggca gtgtctcagc ttcagtgccc cttctctcct ttgagtcccc 22680 

ctttttgcag ctcttggtgc tcttttcacc ttagttttgg gtggaatgag gctgagcagt 22740 

gctgaatctg acagaccagt ttccagtctt gcctggtgtc cacagtcttg tcctgagcct 22800 

cagtttccct tctctataaa ttgaggccat ccatgtctct ctcccagagg ccatcaggcg 22860 

gaaggtggag ctggagtggg gcacagagga tgatgagtac ctggataagg tggagaggaa 22920 

catcaagaaa tccctccagg agcacctgcc cgacgtggtg gtatacaatg caggcaccga 22980 

catcctcgag ggggaccgcc ttggggggct gtccatcagc ccagcggtac gtcctgaccc 23040 

ttggggccac gggagggtct gctctatgga ctcagcagca gcaggaaagg tgggcggcct 23100 

catgtcaggg aggagatgga ctgaagcaac agcagtttgg agcagggcta gccctgcagc 23160 

aggacttcct gacaccatgg gggtctggcc tgcctgagtc accctcctct tcccctaaca 23220 

gggcatcgtg aagcgggatg agctggtgtt ccggatggtc cgtggccgcc gggtgcccat 23280 

ccttatggtg acctcaggcg ggtaccagaa gcgcacagcc cgcatcattg ctgactccat 23340 

acttaatctg tttggcctgg ggctcattgg gcctgagtca cccagcgtct ccgcacagaa 23400 

ctcagacaca ccgctgcttc cccctgcagt gccc 23434 